ABSTRACT

This report describes a new automated process that pioneers full-scale
operational use of subject switching by the NASA (National Aeronautics
and Space Administration) Scientific and Technical Information (STI)
Facility. The subject switching process routinely translates
machine-readable subject terms from one controlled vocabulary into the
equivalent terms of another controlled vocabulary. The report also
describes the NASA Lexical Dictionary (NLD), how to build a lexical
dictionary system, what resources are needed, and how to maintain the
system after it is built. A description of the NASA STI Facility's
experiences with its first input vocabulary, that of the Defense
Technical Information Center (DTIC), is included. Following a preface
and executive summary, this report is divided into seven sections:
(1) introduction (purpose, significance, definition of the NASA Lexical
Dictionary, scope of NLD, preliminary results, presentation, and project
personnel); (2) system description; (3) history; (4) procedures for
building a lexical dictionary; (5) data file maintenance; (6) results
and conclusions; and (7) summary. A glossary, two appendices, and
references are included. (THC)
PREFACE

This report describes a new automated process that pioneers full-scale
operational use of subject switching by the NASA Scientific and Technical
Information (STI) Facility. The subject switching process routinely
translates machine-readable subject terms from one controlled vocabulary
into the equivalent terms of another controlled vocabulary. To do subject
switching, we use a system called the NASA Lexical Dictionary (NLD). The
report also describes the NLD, how to build a lexical dictionary system,
what resources are needed, and how to maintain the system after it's built.
The experience of the NASA STI Facility with its first input vocabulary,
that of the Defense Technical Information Center (DTIC), is included in the
section labeled HISTORY.

We would like to acknowledge the help given to this project by
personnel at DTIC. Without their cooperation the construction of the NLD
would have been more difficult and costly.

Work on the NLD, done by Planning Research Corporation/Government
Information Systems, was supported by the National Aeronautics and Space
Administration's Scientific and Technical Information Branch under contract
NASw-3330. The period of performance covered by this report is from
November 2, 1981 to December 31, 1983.
EXECUTIVE SUMMARY

The NASA Lexical Dictionary (NLD), a system that automatically
translates input subject terms to those of NASA, was developed in four
phases. Phase One provided Phrase Matching, a context-sensitive
word-matching process that matches input phrase words with any NASA
Thesaurus posting (i.e., index) term or Use reference. Other Use references
have been added to enable the matching of synonyms, variant spellings, and
some words with the same root. Phase Two provided the capability of
translating any individual DTIC term to one or more NASA terms having the
same meaning. Phase Three provided NASA terms having equivalent concepts
for two or more DTIC terms, i.e., coordinations of DTIC terms. Phase Four
was concerned with indexer feedback and maintenance. Although the original
NLD construction involved much manual data entry, ways were found to
automate nearly all but the intellectual decision-making processes. In
addition to finding improved ways to construct a lexical dictionary, new
applications for the NLD have been found and are being developed.
INTRODUCTION



Purpose 

The purpose of the NLD is to minimize the indexing of documents
already indexed by another agency. Approximately half of the report
literature added to the NASA STI Facility data bases each year has been
previously cataloged, abstracted, and indexed by another agency. See
Figure 1. Much of this previously processed material is received at the
NASA STI Facility (hereafter referred to as the Facility) in
machine-readable form on magnetic tape. The Facility's objective is to
accept as much as possible of this work, and the NLD is part of the overall
effort. The NLD accepts, in machine-readable form, words and phrases from
the document record created by another agency and translates them into
valid NASA index terms. The words and phrases that are run through the NLD
are normally terms from a controlled vocabulary such as DTIC's. However,
it is possible to take a title or a line or two of text and treat it as if
it were a long phrase. While the Access Routine was not designed to select
phrases from text, it can be used to generate posting terms from a limited
amount of text such as a title, title supplement, note of content, or from
words and phrases from any machine-readable source. The terms generated in
this manner must be reviewed and may need to be edited by the indexer.



Significance 

The NLD Subject Switching system is a flexible tool. It could be
implemented as a time-saving device by any organization that accessions and
reindexes documents that have been indexed by another organization. The
components of the NLD provide the basis on which to build either a system
for automatically indexing text, or a system for the automatic translation
of index terms from any controlled vocabulary to another.

Definition of the NASA Lexical Dictionary

A lexical dictionary has been defined in several ways.
Paul H. Klingbiel, who initiated the NLD, defines it two ways in his latest
work (ref. 1): as "a phrase structure rewrite system" and as "a matrix."
Roxanne Newton defined the NLD (in the "System Overview" written for
Facility use) as "a translation device." These different descriptions
represent different points of view. To a mathematician, a lexical
dictionary is a matrix; to a linguist, it's a grammar; to an accountant,
the system may resemble a spreadsheet; but to those dealing with operating
systems, the lexical dictionary is a translation device. June Silvester
adds that, to the indexer, the lexical dictionary is a tool.

This report addresses itself primarily to the operating system
definition, that is, that the lexical dictionary is a translation device,
and secondarily to the idea that the lexical dictionary is an indexer tool.
Scope of the NLD

The original NLD project, which aimed to translate DTIC indexing to
NASA indexing, involved two procedures. First, the translation of the
concept of every individual term in DTIC's controlled vocabulary to one or
more NASA terms that express the same concept. Second, the translation of
two or more coordinated DTIC terms to the NASA term or terms expressing the
same concept. If the concept of coordinated terms required more than one
NASA term for its translation, the NASA terms must be different from the
DTIC terms. For example:

    DTIC                             NASA

    administrative personnel         management, personnel
    aeroelasticity                   aeroelasticity
    aircraft;drones                  drone aircraft
    army planning;army research      project planning, armed forces
                                     (United States)

The DTIC terms and their NASA equivalents are given to the indexers for
review. Indexer review consists of four functions:

• Accept or reject NASA terms listed by the NLD.

• Add any terms not listed by the NLD that are necessary for the
  NASA environment.

• Indicate which terms are major terms.

• Recommend NLD changes such as: ways of improving translations,
  deleting irrelevant translations, or adding terms that should be
  coordinated to translate to a single term.

Preliminary Results 

Management expected that the use of the NLD would shorten indexing
time, and it has. Based on a questionnaire that was used as an evaluation
tool, the NLD saves at least three minutes per document indexed.

The project personnel expected that the use of the NLD would make
indexing more of a decision-making process and less of a lookup job:
look up the term, look up the word form, look up the spelling, etc. The NLD
has done that, too.

The indexers had mixed expectations. Some feared that the NLD would
eliminate their jobs. The NLD has not done that. Many terms are context
sensitive and, to maintain high quality indexing at the Facility, the
decision was made to require indexer review. The NLD provides a
team-approved* translation of any input word or phrase. The indexer
provides a check on context sensitive selections, a choice of pertinent
NLD-suggested terms, and any additional index terms needed to serve the
NASA environment. We feel that this combination makes the NLD an expert
system (see section on SYSTEM DESCRIPTION).



Indexers who understand best how the NLD operates and what its output
means are the most enthusiastic. Indexers must be trained to achieve
optimal use of the NLD, but the training required is minimal.



Presentation 

This report details the resources required for implementing the NLD
Subject Switching System and provides a step-by-step implementation plan.
It includes a system overview, the NASA experience with DTIC-to-NASA
Subject Switching, the three-phase implementation plan which we followed,
and some recommendations for doing it more easily. It also describes
system maintenance: what is involved and how to do it. Finally, we discuss
the benefits, problems, and future of the NASA Lexical Dictionary.

Project Personnel 

The NLD has had three project directors since its inception: Paul H.
Klingbiel, Roxanne Newton, and former analyst June P. Silvester.
Programming has been provided by Elaine Sellman, succeeded by Duchesne
"Duke" Clark, and Patricia Carroll. They were assisted by Midori Keech and
Nina Kit. Posting term translations were done by senior retrieval analyst
Edna Fleek, lexicographer Ron Buchan, abstracting/indexing supervisor
Jacqueline Streeks, Klingbiel, Newton, and Silvester. The project was also
supported by the publications and data entry staff.



*The NLD team for this task consisted of the project director, 
analysts, lexicographer, and the abstracting/indexing supervisor. 
Translations done by any team member were reviewed by other team members. 
SYSTEM DESCRIPTION

Significance Within the Larger System

As we have stated, the lexical dictionary is a tool. Although it is
used to automate the translation of one agency's vocabulary to that of
another, the lexical dictionary does more. Its use can alter indexing
procedures by relieving indexers of purely mechanical tasks. The
maintenance of the lexical dictionary stimulates increased communication
and cooperation between agencies involved. Lexical dictionary construction
brings a new awareness of the shortcomings and strengths of various
thesauri and the need to improve terminological standards within the
government. Thinking and talking about ways to communicate better is
better communication, or at least some communication where often little or
none existed before.

Expert System Concept

General Description. The lexical dictionary might be classed as an
expert system, although it is a somewhat rudimentary one. By an expert
system we mean a system that can emulate human reasoning. William B.
Gevarter describes the components of an expert system as follows:

(1) a knowledge base (or knowledge source) of domain facts and
    heuristics associated with the problem;

(2) an inference procedure (or control structure) for utilizing the
    knowledge base in the solution of the problem;

(3) a working memory--"global data base"--for keeping track of the
    problem status, the input data for the particular problem, and
    the relevant history of what has thus far been done.

(ref. 2, p. 80).

The NLD has data files created by NASA vocabulary experts. The files
include logic codes. These files are constantly added to, corrected, and
improved by the experts who created the original knowledge base and by
others who interface with the system. The logic codes in the files provide
direction to the Access Routine (see subsection on System Components,
Lexical Dictionary Access Routine). These files with their domain facts
satisfy the requirement for the knowledge base and heuristics associated
with the problem.

The Access Routine approximates an inference procedure by using the
NLD files or knowledge base in the solution of the problem. The problem is
defined as the determination of acceptable combinations of words from an
input source to be translated into authorized NASA index terms.

The working memory for the NLD traces input material through the
translation process. The memory makes it possible to print lists of input
terms with their translations, input terms with partial translations, input
terms with no translations, and statistics about the logic codes used.
Thus the NLD meets the basic criteria for an expert system.

Another way in which the NLD system might be considered expert is its
use of the best of two approaches: Indexer Simulation and Indexer
Feedback.

Indexer Simulation. A lexical dictionary simulates the indexer's
translation of input terms into target vocabulary terms. The printout
provided by the lexical dictionary lists the terms from the input
vocabulary and the terms from the target vocabulary, side by side, ready
for indexer review and selection. This process uses the computer to do the
repetitious, uninteresting, time-consuming indexer tasks that are largely
mechanical, and provides expert, consistent translations for a final review
by humans.

Indexer Feedback. The indexer reviews the suggested target terms and
provides the lexical dictionary personnel with recommendations for improved
translations. These recommendations are studied, usually approved, and the
needed changes are made. These changes improve the indexer simulation for
subsequent runs of the program, but the system uses humans to make
decisions not yet possible with available software. Humans also upgrade
the computer system, which results in improved indexer simulation for
subsequent runs of the program.

As in the old chicken and egg go-round, each produces the other. Both
are essential, and they work in a kind of endless loop. Together the
system has the best of both automatic and human input, and it keeps
building on itself, hence an expert system.



System Functions 

As stated before, the NASA Lexical Dictionary system is a translation
device. The NLD translates words and phrases from machine-readable input
material into corresponding NASA Thesaurus posting terms. The mode of
operation, either Phrase Matching or Subject Switching, depends upon the
type of input material being processed.

• The Phrase Matching mode is a general purpose matching routine
  which attempts to find context sensitive word-by-word matches
  between any input phrases and NASA posting terms or Use
  references. Matches may be complete or partial. In some cases,
  no match will be found. For example:

      Input Phrase                       NASA Posting Term(s)

      Salaries                           No match found
      Fuel consumption                   Fuel consumption
      Inorganic acids                    Acids
      Cellulose acetates                 Cellulose, Acetates
      Chance-Vought military aircraft    Chance-Vought aircraft,
                                         Military aircraft

• The Subject Switching mode is a special purpose routine which
  translates the concepts expressed by the posting terms assigned
  to a document by a particular contributing source (such as DTIC)
  into the equivalent concept expressed in NASA posting terms.
  Subject Switching treats each input posting term as a unit, in
  contrast to Phrase Matching where the unit is the word. A unique
  translation table is built for the posting terms of each
  contributing source. An entry is created for every contributed
  posting term, but in some cases, the translation may indicate
  that the term is out of scope or not able to be translated. For
  example:

      DTIC Posting Term(s)                 NASA Posting Term(s)

      Regiment level organization          NIS (Not In Scope for NASA)
      Complementary metal oxide            CMOS
        semiconductors
      Internal combustion engine noise     Engine noise, Internal
                                           combustion engines
      Abrasion, Resistance                 Abrasion resistance
      Self treatment                       00 (No NASA translation)



A more detailed explanation of these two translation modes is provided in 
the section on DATA FILE MAINTENANCE. 

The Lexical Dictionary system can be used as the basis for an
automatic indexing system to process text fields, such as abstracts.
Automatic indexing, if it were instituted, would require the addition of a
word recognition file to assign syntax codes and a program or programs to
break text into logical words and phrases for processing. DTIC uses a
similar system for automatic indexing of several data bases.

System Components 

The NLD has three major components: data files, which act as
translation tables; an Access Routine, which manipulates the input words
and phrases, matches them against the data files, and returns the NASA
translation to the application program; and application programs, which
call the Access Routine.

Figure 2 gives an overview of the NLD system operation, and this
section will describe briefly the three NLD components.






Figure 2 

Overview of Lexical Dictionary System Operation 
Data Files. The Lexical Dictionary system employs two types of Virtual
Storage Access Method (VSAM) files:

• A general purpose Phrase Matching file and

• special purpose Subject Switching files for the controlled
  vocabulary (thesaurus terms) of each contributing source.

The file organization and record layout for both types of files are the
same. Each NLD file record consists of the following fields:

• Key

  Each key is unique and consists of terms that may be encountered
  in the input material. The key can consist of a single element,
  followed by a semicolon and two zeros (;00), or of multiple
  elements separated by semicolons (;). In the Phrase Matching
  file, these elements are the individual words that make up the
  target vocabulary posting terms or Use references. In the
  Subject Switching file, each element is an entire posting term
  from the vocabulary of the contributing source for that file.
  Terms may be single or multiple words.

• Logic Code

  The Logic Code is a one-character code that indicates how the key
  is to be processed. Single-element keys are assigned one of the
  following logic codes:

      E - (Equal) The key translates to a single posting term
          that is identical to the key.

      C - (Change) The key translates to a single posting term
          that is different from the key.

      L - (List) The key translates to multiple posting terms
          that should be used in combination.

      I - (Indexer Choice) The translation of the key is context
          dependent. The meaning appropriate for the document at
          hand must be selected and a choice of posting terms is
          offered.

      0 - No translation is available for the key.

  When there are multiple elements in the key, the logic code T
  (Table) is always used.

• Posting Term

  The posting term field contains the NASA posting term or terms to
  which the key is to be translated. The field may also contain
  the following special symbols, which serve as an aid to indexers:

      ? - NASA posting term is an array or ambiguous term.
        - NASA posting term is broader than the contributing
          source term in the key.

        - NASA posting term has narrower terms that the
          indexer should consider. These are terms
          that the contributing source does not have.

        - Indexer should choose one or more of the NASA
          posting terms as appropriate.

     00 - No appropriate NASA translation is available.

    NIS - The contributing source term is NOT IN SCOPE for
          NASA.

In some cases, there are more than two elements in a key. NLD System
processing requires that intermediate records be created which build to
these multi-element keys. The first entry will consist of the first two
elements. Each successive entry will add one more element until the entire
phrase is complete. Since the intermediate keys do not have translations,
the posting term fields for these records contain a special symbol as a
place holder. For example:

    Logic Code    Key                             Posting Term

    T             Body;Centered                   *
    T             Body;Centered;Cubic             **
    T             Body;Centered;Cubic;Lattices    Body centered cubic lattices
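
The build-up of intermediate records described above can be illustrated
with a short Python sketch. It is not the Facility's own code; the function
name `expand_key` and the use of a single `*` as the place holder are
assumptions made for illustration only.

```python
def expand_key(elements, posting_term, logic_code="E", placeholder="*"):
    """Return (logic code, key, posting term) records for one entry.

    A single-element key is stored as "element;00" with the given logic
    code. A multi-element key always uses logic code T and is preceded
    by intermediate records that grow by one element at a time.
    """
    if len(elements) == 1:
        return [(logic_code, f"{elements[0]};00", posting_term)]

    records = []
    for size in range(2, len(elements) + 1):
        key = ";".join(elements[:size])
        # Only the complete key carries a real translation; shorter
        # keys hold the (assumed) place holder symbol.
        term = posting_term if size == len(elements) else placeholder
        records.append(("T", key, term))
    return records


for record in expand_key(["Body", "Centered", "Cubic", "Lattices"],
                         "Body centered cubic lattices"):
    print(record)
```

A four-element key therefore produces three records, matching the example
above.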

Samples of records from the Phrase Matching file and the Subject Switching
file are shown below:

Phrase Matching File Sample

    Logic Code    Key              Posting Term

    E             Bleeding;00      Bleeding
    C             Blends;00        Mixtures
    E             Blight;00        Blight
    T             Blind;Landing    Blind landing
    T             Block;Band       Block band

(Logic codes I and 0, and symbols >, ?, +, 00, and NIS, are not normally
used in the Phrase Matching file.)

Subject Switching File Sample

    Logic Code    Key                           Posting Term(s)

    E             Filters;00                    Filters
    E             Financial management;00       Financial management
    C             Fingernails;00                Fingers
    0             Fingerprint recognition;00    00
    E             Fins;00                       Fins+
    L             Fire alarm systems;00         Fires, Warning systems
    I             Fire protection;00            Fire prevention?,
                                                Fireproofing?
    T             Floating bodies;Sea ice       Ice floes
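
To show how records of this kind might be consulted, the following Python
sketch looks up a small set of source posting terms. It is only an
illustration under stated assumptions: the dictionary `RECORDS` stands in
for a VSAM file, the function `switch_terms` is hypothetical, and the rule
that a term consumed by a coordinated (T) key is not also translated on its
own is an assumption, not something this report states.

```python
from itertools import combinations

# Hypothetical in-memory stand-in for a Subject Switching file. The
# entries copy sample records shown above: key -> (logic code, terms).
RECORDS = {
    "Filters;00": ("E", ["Filters"]),
    "Fingernails;00": ("C", ["Fingers"]),
    "Fingerprint recognition;00": ("0", ["00"]),
    "Fire alarm systems;00": ("L", ["Fires", "Warning systems"]),
    "Fire protection;00": ("I", ["Fire prevention?", "Fireproofing?"]),
    "Floating bodies;Sea ice": ("T", ["Ice floes"]),
}


def switch_terms(source_terms, records=RECORDS):
    """Return (key or term, logic code, NASA posting terms) tuples."""
    terms = sorted(source_terms)  # keys hold elements in sorted order
    matches, used = [], set()

    # First try coordinated two-term keys such as "A;B".
    for first, second in combinations(terms, 2):
        key = f"{first};{second}"
        if key in records:
            code, postings = records[key]
            matches.append((key, code, postings))
            used.update((first, second))

    # Then translate the remaining terms one at a time ("term;00").
    for term in terms:
        if term in used:
            continue
        code, postings = records.get(f"{term};00", (None, []))
        matches.append((term, code, postings))
    return matches


for match in switch_terms(["Sea ice", "Floating bodies", "Fingernails"]):
    print(match)
```

A term with no record at all comes back with a logic code of `None`, which
an application program could report as an input term with no translation.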

Lexical Dictionary Access Routine. The NLD Access Routine is a
general purpose program that accesses the Lexical Dictionary files. Its
product is a list of index terms from the NASA Thesaurus, which is the
target vocabulary.

The Access Routine never operates independently; it is always called by an
application program. The application program passes the Access Routine two
things:

• a code that indicates whether the Phrase Matching or Subject
  Switching mode should be employed and

• a character string that is either a word or phrase for Phrase
  Matching or the set of posting terms assigned to a record by a
  contributing source for Subject Switching.

As the first processing step, the Access Routine creates an array from the
input character string. For Phrase Matching, each word in the phrase is
treated as an individual element, and the words are left in the natural
order of the phrase. For Subject Switching, each posting term (which may
be single word or multiple word) is treated as an element, and the posting
terms are sorted in alphabetical order.

The following examples show a Phrase Matching and a Subject Switching
input array:

    Phrase Matching                   Subject Switching

    Input phrase: Engine Endurance    Input DTIC Posting Terms: Engines,
    Testing Research Laboratories     Laboratory Tests, Endurance (General),
                                      Laboratories

    Phrase Matching Array:            Subject Switching Array:

    Engine                            Endurance (General)
    Endurance                         Engines
    Testing                           Laboratories
    Research                          Laboratory Tests
    Laboratories

Aside from the initial difference in creating the input array, processing
by the Access Routine is basically the same for the Phrase Matching and
Subject Switching modes. A general description of this processing may be
found in Appendix A.
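
The array-building step can be sketched in Python as follows. The mode
names and the comma used to separate posting terms are assumptions, since
this section does not say how the input string is delimited; the matching
steps covered in Appendix A are not shown.

```python
def build_input_array(mode, text):
    """Split an input string into the array the lookup would work on.

    mode "phrase":  one element per word, kept in natural order.
    mode "subject": one element per posting term, sorted alphabetically.
    """
    if mode == "phrase":
        return text.split()
    if mode == "subject":
        # Assumed delimiter: posting terms separated by commas.
        terms = [term.strip() for term in text.split(",") if term.strip()]
        return sorted(terms, key=str.casefold)
    raise ValueError(f"unknown mode: {mode!r}")


print(build_input_array("phrase",
                        "Engine Endurance Testing Research Laboratories"))
print(build_input_array(
    "subject",
    "Engines, Laboratory Tests, Endurance (General), Laboratories"))
```

With the two inputs from the example above, the sketch yields the same two
arrays in the same order.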
Application Interface Programs. The NLD system is designed so that
the application program determines the translation mode to be used and the
files to be accessed. The Access Routine performs a standard processing
routine based on these requirements and returns all matches that it finds
to the application program. The application program determines which of
the matches will be used. Because of this design, adding new applications
or modifying requirements of existing applications does not generally
require changes to the NLD system itself. Normally only the application
program must be created or modified.
HISTORY

DTIC's Role

Paul Klingbiel, first director of the NLD Project, had been active for
18 years in linguistic research at DTIC. While there, he had initiated a
lexical dictionary which became part of DTIC's machine-aided indexing
system.

NASA had been studying methods of reducing duplication of work done by
other agencies. In 1981, it was decided to move ahead with plans for a
NASA Lexical Dictionary, designed to switch automatically the subject terms
selected by DTIC's indexers to NASA terminology.

Klingbiel, by then retired from DTIC, agreed to organize the project.
Copies of the lexical dictionary software were obtained from DTIC, and
programmer Elaine Sellman began a study of NLD requirements.

DTIC's programs were written in COBOL for a UNIVAC mainframe while the
Facility used a different programming language, PL/I, and an IBM mainframe.
So, although the DTIC software was available, it served primarily as an
example and the basis for the new NLD programs.

A tape of DTIC's lexical dictionary file also was obtained. This was
used to determine how DTIC would translate NASA posting terms into DTIC
posting terms and was helpful in constructing entries that translated
coordinations of DTIC terms into single NASA terms.



NASA KWOC and Data Entry

Klingbiel began the NLD with a list of NASA posting terms in a special
Key Words Out of Context (KWOC) format. A KWOC listing had been used at
DTIC to review and correct inconsistencies that had entered into the
Natural Language Database. By starting the NLD with a KWOC printout of all
of NASA's posting terms and Use references, the problems experienced at
DTIC were avoided. In fact, the KWOC became the basic tool for coding NLD
entries. (See Figure 3 for a sample page of the NASA KWOC.) Column 1
lists the unique words in the NASA controlled vocabulary in alphabetical
order. Column 2 shows all NASA terms and Use references that are in the
Thesaurus and that contain the word in column 1. Column 3 lists only NASA
posting terms. These are either the same terms that appear in column 2 or
authorized NASA posting terms that are to be used for those in column 2.

Entries for the Lexical Dictionary were selected from column 2. Only
entries that began with the word in column 1 were selected; all of the
others in that array were selected for coding as they appeared in other
sections of the alphabet where the initial word in column 2 and the unique
word in column 1 matched.

Figure 3

KWOC OF NASA THESAURUS AND USE REFERENCES

Column 1              Column 2                                   Column 3

OPERATIONAL           OPERATIONAL AMPLIFIERS                     OPERATIONAL AMPLIFIERS
                      OPERATIONAL CALCULUS                       OPERATIONAL CALCULUS
                      OPERATIONAL HAZARDS                        OPERATIONAL HAZARDS
                      OPERATIONAL PROBLEMS                       OPERATIONAL PROBLEMS
                      TIROS OPERATIONAL SATELLITE SYSTEM         TIROS OPERATIONAL SATELLITE SYSTEM
OPERATIONS            AIR DROP OPERATIONS                        AIR DROP OPERATIONS
                      AIRLINE OPERATIONS                         AIRLINE OPERATIONS
                      FLIGHT OPERATIONS                          FLIGHT OPERATIONS
                      LOADING OPERATIONS                         LOADING OPERATIONS
                      MILITARY OPERATIONS                        MILITARY OPERATIONS
                      OPERATIONS                                 OPERATIONS
                      OPERATIONS RESEARCH                        OPERATIONS RESEARCH
                      PREFLIGHT OPERATIONS                       PREFLIGHT OPERATIONS
                      RESCUE OPERATIONS                          RESCUE OPERATIONS
OPERATL               GEOSTATIONARY OPERATL ENVIRON SATELLITE B  GOES B (NOAA)
OPERATOR              BERGMAN OPERATOR                           BERGMAN OPERATOR
                      OPERATOR PERFORMANCE                       OPERATOR PERFORMANCE
                      STURM-LIOUVILLE OPERATOR                   STURM-LIOUVILLE THEORY
OPERATORS             DIFFERENTIAL OPERATORS                     DIFFERENTIAL EQUATIONS
                                                                 OPERATORS (MATHEMATICS)
                      FREDHOLM OPERATORS                         FREDHOLM EQUATIONS
                                                                 OPERATORS (MATHEMATICS)
                      LAPLACE OPERATORS                          LAPLACE TRANSFORMATION
                      OPERATORS                                  OPERATORS
                      OPERATORS (MATHEMATICS)                    OPERATORS (MATHEMATICS)
                      OPERATORS (PERSONNEL)                      OPERATORS (PERSONNEL)
OPHTHALMODYNAMOMETRY  OPHTHALMODYNAMOMETRY                       OPHTHALMODYNAMOMETRY
OPHTHALMOLOGY         OPHTHALMOLOGY                              OPHTHALMOLOGY
OPIK                  OPIK THEORY                                OPIK THEORY
OPOSSUM               OPOSSUM                                    OPOSSUM
OPTICAL               MINITRACK OPTICAL TRACKING SYSTEM          MINITRACK SYSTEM
                      OPTICAL ABSORPTION                         ELECTROMAGNETIC ABSORPTION
                                                                 LIGHT TRANSMISSION
                      OPTICAL ACTIVITY                           OPTICAL ACTIVITY
                      OPTICAL AMPLIFIERS                         LIGHT AMPLIFIERS
                      OPTICAL COMMUNICATION                      OPTICAL COMMUNICATION
                      OPTICAL CORRECTION PROCEDURE               OPTICAL CORRECTION PROCEDURE
                      OPTICAL COUNTERMEASURES                    OPTICAL COUNTERMEASURES
                      OPTICAL COUPLING                           OPTICAL COUPLING
                      OPTICAL DATA PROCESSING                    OPTICAL DATA PROCESSING
                      OPTICAL DATA STORAGE MATERIALS             OPTICAL DATA STORAGE MATERIALS
                      OPTICAL DENSITY                            OPTICAL DENSITY
                      OPTICAL DEPOLARIZATION                     OPTICAL DEPOLARIZATION
                      OPTICAL EMISSION                           LIGHT EMISSION
                      OPTICAL EMISSION SPECTROSCOPY              OPTICAL EMISSION SPECTROSCOPY
                      OPTICAL EQUIPMENT                          OPTICAL EQUIPMENT

For example, in Figure 3, note the term OPERATIONS in the second
column. It matches the word OPERATIONS in the first column and should be
posted to the term appearing in column 3, namely OPERATIONS. The first
word of the term immediately following OPERATIONS, i.e. OPERATIONS
RESEARCH, also matches the word in column 1, and this item should be posted
to the term appearing on the corresponding line in column 3, i.e.
OPERATIONS RESEARCH.

The KWOC listing also was used to determine the proper logic code. In
the case of OPERATIONS in column 2 which is posted to OPERATIONS in column
3, it would appear that the two are equal and the logic code should be E.
However, notice that the next term after OPERATIONS, i.e. OPERATIONS
RESEARCH, consists of two words making two elements in the key to the
record. For any key with two or more elements or for any single element
key that matches the first element of a longer key, the logic code must
contain a T. And so the KWOC helped the person coding entries to select
the proper logic code.
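
The rule just stated can be written as a small Python check. This is a
sketch of the rule as worded in this section, not of the original coding
procedure; the function name `needs_table_code` and the representation of
keys as tuples of elements are assumptions.

```python
def needs_table_code(key, all_keys):
    """Return True if the logic code for `key` must contain a T.

    `key` and every entry of `all_keys` are tuples of key elements,
    for example ("OPERATIONS", "RESEARCH").
    """
    if len(key) >= 2:
        return True
    # A single-element key also needs T when it is the first element
    # of some longer key.
    return any(len(other) > 1 and other[0] == key[0] for other in all_keys)


keys = [("OPERATIONS",), ("OPERATIONS", "RESEARCH"), ("OPOSSUM",)]
for key in keys:
    print(key, needs_table_code(key, keys))
```

Here OPERATIONS needs a T because OPERATIONS RESEARCH begins with it, while
OPOSSUM does not.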

Entries for the Lexical Dictionary were coded for keypunching.
Specially printed coding sheets were used (see Figure 4) to keep the
various parts of the entry in the proper columns. Three lines (and
therefore three cards) were required for each one- or two-element key. For
each additional word in a key, three additional cards were coded, punched,
and added to the deck. All cards contained an identifying five digit
number. The first four digits were assigned consecutively except that the
same four digits appeared on three cards before the number changed. When
9999 was reached, the sequence returned to 0001. Since the original record
that had been numbered 0001 had already been loaded onto magnetic tape, the
duplication of numbers was not confusing. The final or fifth digit of the
identifying number was either a 1, 2, or 3. It indicated which of the
three parts of the record the card contained. All cards with numbers
ending in 1 contained the logic code. For a one- or two-element term, card
1 also contained the first element. Card 2 contained the second element or
two zeroes. Card 3 held the posting term for that record. For terms with
three or more elements, card 3 contained a continuation symbol, card 4 held
the first two elements (separated by a semicolon), card 5 furnished the
third element, and card 6 the posting term for a three-element key or
another continuation symbol if any additional words were required for the
key, and so on.

It can be seen that for a seven-word term, the longest in the NASA
controlled vocabulary, it was necessary to code and keypunch 3(n-1) cards
(where n equals the number of words or elements in a term) or a total of 18
cards. Fortunately, quicker ways are now available for this job.
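
The card layout and the 3(n-1) count can be checked with a short Python
sketch. It follows the description above, but the function name
`card_deck`, the `*` continuation symbol, and the reuse of one logic code
on every group of three cards are assumptions made for illustration.

```python
def card_deck(elements, posting_term, logic_code, start=1, continuation="*"):
    """Return (card number, content) pairs for one dictionary entry."""
    n = len(elements)
    groups = max(n - 1, 1)  # one group of three cards per added element
    cards = []
    for g in range(groups):
        # Four-digit sequence number; after 9999 it returns to 0001.
        seq = (start - 1 + g) % 9999 + 1
        key_so_far = ";".join(elements[:g + 1])
        next_element = elements[g + 1] if n > 1 else "00"
        last = posting_term if g == groups - 1 else continuation
        cards.append((f"{seq:04d}1", f"{logic_code} {key_so_far}"))
        cards.append((f"{seq:04d}2", next_element))
        cards.append((f"{seq:04d}3", last))
    return cards


# A seven-element key (placeholder words W1..W7) needs 3 * (7 - 1) cards.
deck = card_deck([f"W{i}" for i in range(1, 8)], "POSTING TERM", "T")
print(len(deck))
for number, content in deck[:6]:
    print(number, content)
```

The fifth digit of each card number is the part indicator (1, 2, or 3), so
cards 4 through 6 of an entry simply carry the next four-digit sequence
number.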

Logic codes that were being used at that time also were more 
complicated than those used now and contained some additional intelligence. 

At that time card 1 for a three-element term would have the logic code 
of T; card 4 would have a logic code of TT to indicate that a table entry 
existed within a table entry. If the NASA posting consisted of two or more 
terms, the T or TT on the first of the final three cards required for the 
entry would be followed by an L, making the logic code TL or TTL. 

Several programmers recommended that the initial procedure of creating 
the NLD entries be automated. However, project director Klingbiel decided 
that stopping the manual process to reduce the manual procedures to program 
specifications, brief the programmers, write, test, and debug the programs, 
and automatically generate the NLD entries would take more time and be less 
cost effective than finishing the job manually. Therefore, the manual 
coding and keypunching continued. (For the next effort, candidate entries 
were created automatically.) 

The primary job of coding and keypunching entries was finally 
completed, but since there were a number of errors to be corrected, Sellman 
devised a way of doing this online to speed up the process. 

During the time when the entries were being coded and the data entered 
into the file, the records were changed from four fixed-length fields 
storing the logic code, first element, last element, and posting term, to a 
VSAM file containing three fields: the logic code, the key, and the 
posting term. The key for each record was and is unique. Any record in 
the file could and can be replaced by overlaying another record having the 
same key. In this way, logic codes and posting terms can be changed. If 
the error is in the key, it is necessary to delete the record and add it in 
its correct form. 
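
The overlay behavior described above can be illustrated with a small sketch. It uses an ordinary Python dictionary as a stand-in for the keyed file; the real file was VSAM, and the function names here are hypothetical.

```python
def overlay(file: dict, key: str, logic_code: str, posting: str) -> None:
    """Add a record, or replace the record that already has this key."""
    file[key] = (logic_code, posting)


def correct_key(file: dict, wrong_key: str, right_key: str) -> None:
    """A key cannot be overlaid: delete the record and add it under the right key."""
    if right_key in file:
        raise KeyError(f"{right_key!r} already exists")
    file[right_key] = file.pop(wrong_key)


if __name__ == "__main__":
    nld = {}
    overlay(nld, "Chrome;00", "E", "Chrome")    # entry coded with an error
    overlay(nld, "Chrome;00", "C", "Chromium")  # same key, so the record is replaced
    overlay(nld, "Gold;Plat", "T", "Gold coatings")
    correct_key(nld, "Gold;Plat", "Gold;Plate")  # error in the key itself
    print(nld)
```

Because the key is the record's identity, a change to the logic code or posting term is a single overlay, while a change to the key takes a deletion followed by an addition.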

In the spring of 1982 there were some personnel changes. On April 1, 
Klingbiel retired from the Facility, but was retained as a consultant to 
the project. June Silvester became assistant and acting project director, 
but this job was taken over in late May for eight weeks by Ron Buchan while 
Silvester was on extended leave. In the meantime, Edna Fleek completed the 
job of getting the file ready for use. The excellence of her and Buchan's 
work was attested to by the confidence NASA indexers soon had in the 
accuracy of the NLD output. 

When all errors were corrected, the Phrase Matching file became 
operational. This meant that the NLD would find and print out the NASA 
translation for each DTIC term that matched, character for character, 
either a NASA posting term or Use reference. For example: 

Matched          DTIC Posting Term    NASA Posting Term 

Posting term     DECODING             DECODING 
Use reference    DECOMPRESSION        PRESSURE REDUCTION 

The June progress report on the NLD included the following statements: 

The Lexical Dictionary now has about 14,000 records out of a projected 
20,000 in the NASA Thesaurus. After the NASA terms have been coded, a 
tape of NASA terms will be made that can be run against the NASA 
Lexical Dictionary to determine misspelled terms as well as missing 
terms. 

Of the computer identified errors, over 300 have been corrected with 
manual coding and data entry keying. Nearly 200 corrections have been 
made using the TSO direct entry program which consumes 1/3 of the 
labor of the old method. 

A recovery command was developed for the TSO entry system for the 
Lexical Dictionary, enabling the entry of data more than once a day. 

First Operational System 

The June report also stated that: 

The Access Routine was tested and has proven workable leaving only 
questions of format to be considered. This means that we have 
actually achieved subject switching between DTIC and NASA terms. 

On the other hand, Klingbiel's September trip report stated that: 

At this point in time there has been no Subject Switching with either 
NASA or DTIC data, except in the most trivial and incomplete sense, 
because neither file as now constituted contains Subject Switching 
data. Subject Switching cannot occur until the present NLD is 
upgraded with data to be obtained from successful DTIC/NASA, NASA/DTIC 
runs. 

This seeming disagreement with the statement from the progress report 
stemmed from a misunderstanding as to the nature of subject switching. We 
reiterate that subject switching is translating concepts expressed by one 
or more posting terms from the controlled vocabulary of a contributing 
organization to the same concept expressed in the posting terms from the 
target vocabulary, also controlled. 

The system had achieved the capability of matching input phrases, 
character by character (the first operational segment of the NASA Lexical 
Dictionary system), but the translation of concepts was instituted later. 

In early September 1982, the NLD file was transferred from magnetic 
tape to disk files. Also, programmer Sellman left the Facility, turning 
over the NLD development to Duchesne Clark, assisted by Midori Keech. 

Klingbiel visited the Facility September 13-24, ironing out problems 
that had arisen during the summer and laying out in detail the steps to be 
taken before his next visit in December. These tasks were carried out by 
the NLD team of Buchan, Fleek, Silvester, Streets, Clark, Keech, and 
programmer Patricia Carroll, who joined the project in September. The tasks 
included updating and slightly changing the DTIC Lexical Dictionary, fixing 
a problem that had been discovered with the way in which glosses were 
handled, updating the DTIC thesaurus authority listing, doing many error 
checks and corrections, and finally producing four printouts and a copy 
each of the NLD and DTIC's Lexical Dictionary. 

The first of the four printouts was the result of running DTIC's 
posting terms through the NLD which, so far, consisted of just one file and 
a program that could phrase match. This program provided a printout of 
DTIC terms and matching NASA terms, not only when the entire DTIC term 
matched, character for character, but also when only part of the term 
matched. The listing was sorted by the input posting terms, in this case 
DTIC's. 

The second printout was the same information but sorted by the output 
(NASA's posting terms). 

The third printout was the result of running NASA's posting terms 
through the new version of DTIC's lexical dictionary. The printout was 
sorted by DTIC's posting terms (the output). 

Finally, the fourth printout was the same as the third but sorted by 
NASA's posting terms. 

Collectively these printouts totalled over 2,500 pages. When 
Klingbiel returned to the Facility on December 6, 1982, it was determined 
that a more compact presentation of the data was required in order to 
expedite analysis and data entry. 

Discussions with Clark resulted in some changes and reprints of the 
four printouts. To avoid cumbersome nomenclature, the printouts were 
referred to as Books 1 through 4, and identified as follows: 

Book 1 - DTIC/NASA sorted alphabetically by DTIC terms 

Book 2 - DTIC/NASA sorted alphabetically by NASA terms 

Book 3 - NASA/DTIC sorted alphabetically by DTIC terms 

Book 4 - NASA/DTIC sorted alphabetically by NASA terms 

Two of these books were re-sorted. The re-sort analysis conducted by 
Clark resulted in another software change, and finally five copies of each 
book were printed on 8 1/2" x 11" photocopy paper for use by the NLD team. 

The conclusion of Klingbiel's visit on December 10, 1982 coincided 
with the announcement of Roxanne Newton's appointment to the position of 
project director. She had joined the project on November 29. 

Implementation of Subject Switching 

Second Operational System. The data analysis tasks that were to 
occupy the next few weeks were identified and assigned as follows: 

• Analysis of DTIC terms with no mechanically derivable NASA 
counterpart (Buchan, Streeks). 

• Identification of identities between NASA and DTIC terms (Fleek, 
Newton). 

• Compilation of Tables, i.e., coordinated DTIC terms (Silvester). 

Another Klingbiel visit to the Facility was scheduled for January 3-7, 
1983. In the meantime, the team did some analysis and obtained some 
hands-on experience with translating DTIC concepts to the same concepts 
expressed in NASA's terms. 

Klingbiel recommended that as the assigned tasks were being carried 
out, the team: 

1. Note anomalous machine translations for subsequent evaluation. 

2. Evaluate alternative data entry methods. 

3. Collect pertinent statistics which would help in estimating the 
total workload. 

Newton recognized and pointed out that the NLD entries selected from 
the KWOC could be identified even more easily from the NASA Thesaurus. 
This is because logic codes are determined by the initial word position and 
the presence or absence of significant following words. That is, 
significant words in the medial or final positions in a posting term or Use 
reference were of interest only to the extent that they existed or did not 
exist. 

A new data entry method was devised and instituted by Newton, 
Silvester, Clark, and Carroll. At the time of Klingbiel's December visit, 
the four books of data had been categorized by the type of match that they 
supplied between the DTIC and NASA vocabularies (i.e., no match, exact 
match, change, and coordination or tables). Except for the "no match" 
entries, each kind of data was transferred to a dataset that could be 
edited online. Building the datasets in this way kept the files accurate 
since the input had been checked and corrected repeatedly throughout the 
fall months. 

The data in the four printouts, books, or datasets presented a variety 
of problems, most of them anticipated. For instance, "no matches" were 
expected because DTIC's and NASA's vocabularies are designed to support two 
different missions. Human analysis of the "no matches" was able to resolve 
about 80% of the cases, leaving the remaining 20% of these DTIC terms with 
no translation. These were zeroed out. As expected, problems in generic 
level occurred in two ways: DTIC had specific terms for which there was no 
equally specific NASA counterpart and vice versa. 

A problem not explicitly recognized prior to the acquisition of the 
four books of data was that presented by chemical terms. DTIC 
uses a highly coordinated (Boolean) method of indexing with chemical terms 
that can produce significant false coordinations when more than one 
chemical term is indexed for the same document. No obvious solution was 
apparent. 

It was noted in Klingbiel's January trip report that about 10% of the 
data had been analyzed, major problem areas and solutions had been 
identified, an efficient data entry technique had been devised, and 
anomalous data had been noted and either deleted or corrected. 

The translations of individual DTIC posting terms to NASA posting 
terms continued as assigned. 

Meanings of all DTIC terms were examined. Meanings of terms that 
appeared to be identical were compared and translations corrected when 
homonyms were discovered. The evaluation of candidate coordinated entries 
also was begun, as was internal documentation and a preliminary study and 
test of the NLD. As part of the study, the indexers were interviewed 
individually and confidentially. In addition, a test NLD was created, 
enabling a comparison of DTIC, NLD, and NASA-indexer indexing for a sample 
of 100 documents. 

By April 1, 1983 all DTIC terms had been examined and a translation 
for each had been entered into the DTIC Subject Switching file. With the 
loading of these entries into the NLD, the DTIC tapes could be run through 
the second operational system. That did provide Subject Switching on a 
limited basis. 

The entries consisted of the following: 

Type of Entry      Number Coded 

Exact Match        5400 
Partial Match      4500 
No Match           3200 

Third Operational System. By April 28 all of the 6,300 table (or 
coordination) entry candidates had been examined. Over 3,000 entries 
were accepted as presented. Others were accepted with additions or 
alterations. The remainder were deleted. The table entries then were 
loaded into the NLD and full Subject Switching became not only available 
but also operational. 

Review and Feedback. The final phase of developing the DTIC/NASA 
Subject Switching capability of the NLD system began at the end of April 
1983 and is ongoing. This consists of adding and revising entries based on 
feedback from the NASA indexers and on a systematic review of the file by 
the Lexical Dictionary staff. 

PROCEDURES FOR BUILDING A LEXICAL DICTIONARY 


Overview of Lexical Dictionary Implementation for Subject Switching 

The NASA STI Facility has already developed the following major 
components of the NLD system: 

• file structures for Phrase Matching and Subject Switching, 

• coding procedures for Phrase Matching and Subject Switching 
entries, 

• programs for generating candidate Subject Switching entries, 

• the Access Routine program, 

• online file maintenance and validation programs, and 

• application programs suitable for the Facility's uses of the NLD. 

In order to implement the NLD system for another organization, the 
following efforts would be required: 

• modification of the entry creation programs, the Access Routine, 
and the online file maintenance and validation programs to run on 
a different host system, 

• development of application programs suitable for that 
organization's uses of the NLD system, and 

• coding of translation entries to create the Lexical Dictionary 
data files. 

Automated Subject Switching from one vocabulary to another using the 
NLD system can be implemented in four phases. Figure 5 presents an 
overview of these four phases. 

Phase One centers on the construction of a Phrase Matching file for 
the target vocabulary (the vocabulary into which input phrases are to be 
translated). This file consists of entries for every posting term and Use 
reference in the target thesaurus, as well as additional Use references 
constructed specifically for the NLD system. The entries for the file can 
be coded manually or a program can be written to generate them 
automatically from a machine-readable file of the thesaurus. Using the 
Phase One or Phrase Matching file, the NLD system will attempt to match any 
input term or phrase with entries in the file and translate them into 
target vocabulary posting terms. 

In Phase Two, a Subject Switching file is begun. This file is 
basically a translation table between the posting terms of a contributing 
source (the input vocabulary) and the posting terms of the target 
vocabulary. Entries in the file pair each input vocabulary posting term 
with the posting term or terms from the target vocabulary that express the 
equivalent concept. Candidate entries for this file are created by 
processing the input vocabulary posting terms through the target vocabulary 
Phrase Matching file created in Phase One. Analysts then evaluate and edit 
these entries to create the final Subject Switching file. A separate file 
is built for each input vocabulary to be translated. 

Figure 5 

Overview of Lexical Dictionary Implementation for Subject Switching 



Phase Three adds entries for coordinations between posting terms of 
the input vocabulary to the Subject Switching file created in Phase Two. 
These coordination entries represent two or more posting terms from the 
input vocabulary which, when used in combination, translate to a posting 
term or terms in the target vocabulary. One way in which Phase Three 
can be implemented is by creating a Phrase Matching file for the input 
vocabulary, processing the target vocabulary through this file, and 
analyzing and editing the resulting candidate entries. The completion of 
Phase Three makes possible full Subject Switching from the input vocabulary 
to the target vocabulary. 

Phase Four is concerned with user feedback and file maintenance. New 
terms added to both the input thesaurus and the target thesaurus require 
additions and modifications to entries in the data files. In addition, 
users can supply feedback as to translations that should be added or 
modified. 

The following sections describe these four phases in detail. 

Phase One: Phrase Matching File 

Purpose. The creation of the Phrase Matching file makes it possible 
to attempt to match terms and phrases from any source (see the subsection 
on Purpose in the INTRODUCTION) with the target vocabulary. Additional Use 
references from varying forms of target vocabulary terms, such as 
singulars, plurals, spelling variants, and gerunds, also are put into the 
Phrase Matching file. The match capability of the system increases with 
the number of Use references in the file. The Phrase Matching capability 
can be used for any application requiring the translation of words or 
phrases into the target vocabulary. The Phrase Matching file is used in 
building the Subject Switching file and is an essential part of a 
machine-aided indexing system. 

Record Description. Each record in the Phrase Matching file consists 
of three fields: the logic code, the key, and the posting term. 

The logic code in the Phrase Matching file is entered in the first 
column of the record. This code is selected according to prescribed rules 
and provides a weak form of syntax for use by the Access Routine in its 
search for multi-element terms. The logic code also indicates the 
relationship between the key and the posting term(s). 

The key consists of one or more elements. In Phase One, these 
elements are the individual words that make up the target vocabulary 
posting terms or Use references. A single element key will end with a 
semicolon and two zeroes. The key for each entry must be unique and must 
be combined with only one posting term field. Input for the Phrase 
Matching file consists of the target vocabulary posting terms, thesaurus 
Use references, and synonyms for and variants of the terms, which become 
additional Use references. 

The posting term field contains one or more posting terms from the 
target vocabulary. When an input word or phrase matches a key, it is 
translated to the term or terms in the posting term field. 

For each entry in the Phrase Matching file, it is necessary to 
determine the key, the posting term(s), and the logic code. 

Key. The key of the record being constructed is unique. It is the 
subject of the record and consists of the words of the term or Use 
reference being described. The Phrase Matching file is based on keys 
created from the target thesaurus posting terms and Use references. 
Additional Use references, such as singulars and plurals, may also be 
added. Each word in the posting term or Use reference is a separate 
element in the key. 

• If the key consists of only one word, a semicolon and two zeroes 
are added following this single element. For example: 

Term: Controllability 
Key: Controllability;00 

• If the key consists of more than one word, the words (or 
elements) are separated by semicolons. For example: 

Term: Geological surveys 
Key: Geological;Surveys 

• If the key is identical to the first two or more elements of a 
longer key, then in addition to separating the words by 
semicolons, a semicolon and two zeroes are added following the 
final element. For example: 

Terms: Charge transfer devices 
       Charge transfer 
Keys: Charge;Transfer;Devices 
      Charge;Transfer;00 

Some specific formatting rules follow: 

• Hyphenated words or two words separated by a slash are treated as 
a single element. For example: 

Term: Government/industry relations 
Key: Government/industry;Relations 

• An ampersand (&) is treated as a word. For example: 

Term: Atmospheric & Oceanographic Information System 
Key: Atmospheric;&;Oceanographic;Information;System 

• Parentheses are dropped from around words in the key. For 
example: 

Term: Hudson River (NY-NJ) 
Key: Hudson;River;NY-NJ 
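
The key rules above can be sketched as a small Python function. This is an illustration only, not the Facility's program. The function name and the prefix_of_longer_key flag are hypothetical, and the capitalization of each element is an assumption taken from the printed examples rather than from a stated rule.

```python
import re


def build_key(term: str, prefix_of_longer_key: bool = False) -> str:
    """Build a Phrase Matching key from a posting term or Use reference."""
    # Parentheses are dropped. Hyphenated and slashed words stay whole and "&"
    # becomes its own element simply because the split is on whitespace only.
    words = re.sub(r"[()]", "", term).split()
    if not words:
        raise ValueError("empty term")
    # Assumption: each element starts with a capital, as in the examples.
    elements = [w[:1].upper() + w[1:] for w in words]
    # ";00" closes a one-word key, or a key that is also the start of a longer key.
    if len(elements) == 1 or prefix_of_longer_key:
        elements.append("00")
    return ";".join(elements)


if __name__ == "__main__":
    print(build_key("Controllability"))
    print(build_key("Geological surveys"))
    print(build_key("Charge transfer", prefix_of_longer_key=True))
    print(build_key("Hudson River (NY-NJ)"))
```

Whether a key is the start of a longer key depends on the rest of the file, so the sketch leaves that decision to the caller instead of guessing it from the term.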

Posting Term. The posting term field represents the target 
vocabulary's equivalent of the elements that appear in the key. The 
posting term or terms are entered exactly as they appear in the target 
vocabulary thesaurus. In the Phrase Matching file, posting terms listed in 
the key field are posted to the same term in the posting term field. The 
Use references in the key field go to one or more valid posting terms in 
the posting term field. Multiple posting terms are separated by commas. A 
space is left between words in a posting term, but not between multiple 
posting terms in the posting field. For example: 

Key                               Posting Term(s) 

Controllability;00                Controllability 
Chrome;00                         Chromium 
Geoastrophysics;00                Astrophysics,Geophysics 
Geological;Surveys                Geological surveys 
Gold;Plate                        Gold coatings 
Gold;00                           Gold 
Government/industry;Relations     Government/industry relations 
Hudson;River;NY-NJ                Hudson River (NY-NJ) 



Logic Code. The logic code indicates the relationship between the key 
and the posting term. The first three logic codes are used with single 
word keys. E indicates that the single word key and the posting term are 
EQUAL or exact matches. For example: 

E    Controllability;00      Controllability 

C indicates that the posting term shows a CHANGE from the single word key. 
For example: 

C    Chrome;00               Chromium 

L indicates that the single word key is posted to a LIST or multiple 
posting terms. For example: 

L    Geoastrophysics;00      Astrophysics,Geophysics 

If the key contains two or more words, the logic code is a T. The T refers 
to the TABLE format of the coded file entries. For example: 

T    Geological;Surveys      Geological surveys 
T    Gold;Plate              Gold coatings 
T    Hinged;Rotor;Blades     Hinges,Rotary wings 
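
The four Phase One codes can be derived mechanically from a key and its posting terms. The sketch below is one reading of the rules above, not the Facility's coding program; the function name is hypothetical. It assumes that the comparison for E ignores letter case and any trailing "@" marker (see Special Symbols).

```python
def phase_one_logic_code(key: str, posting_terms: list[str]) -> str:
    """Return E, C, L, or T for a Phrase Matching entry."""
    if not posting_terms:
        raise ValueError("at least one posting term is required")
    elements = key.split(";")
    if elements[-1] == "00":          # ";00" is a terminator, not a word
        elements = elements[:-1]
    if len(elements) >= 2:            # two or more words: TABLE
        return "T"
    if len(posting_terms) > 1:        # single word posted to several terms: LIST
        return "L"
    # Assumption: case and a trailing "@" do not count as a change.
    same = posting_terms[0].rstrip("@").lower() == elements[0].lower()
    return "E" if same else "C"       # EQUAL or CHANGE


if __name__ == "__main__":
    print(phase_one_logic_code("Controllability;00", ["Controllability"]))
    print(phase_one_logic_code("Chrome;00", ["Chromium"]))
    print(phase_one_logic_code("Geoastrophysics;00", ["Astrophysics", "Geophysics"]))
    print(phase_one_logic_code("Hinged;Rotor;Blades", ["Hinges", "Rotary wings"]))
```

Note that T takes precedence: a multi-word key is coded T even when it is posted to a list of terms, as in the Hinged;Rotor;Blades example.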



Continuation Entries. When a key exceeds two words, special 
continuation entries must be made for use in NLD system processing. The 
key for the first of these continuation entries is made up of the first two 
words of the term. The next key is created by adding the next word from 
the term to the key. Additional entries are created in this way until the 
entire term appears in the key. 

A symbol is used in the posting term field to indicate that the 
program must continue to look for additional key elements in order to reach 
the proper posting term. The format for the entries required for a term of 
multiple words is a table. For example: 

A term consists of seven words, ABCDEFG, and it is to be posted to a 
term of three words, HIF. The entries are as follows: 

Logic Code    Key              Posting Term(s) 

T             A;B              * 
T             A;B;C            ** 
T             A;B;C;D          % 
T             A;B;C;D;E        %% 
T             A;B;C;D;E;F      %%% 
T             A;B;C;D;E;F;G    HIF 

The asterisks and percent signs in the posting term field not only tell the 
Access Routine that additional elements must be located, but also tell the 
analyst how many elements belong in the entry, how many entries the term 
requires, and in the case of omissions, which entries need to be added. A 
program is available that will create the continuation entries, so they do 
not need to be manually coded. 
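
The table-building step can be sketched as follows. This is not the Facility's continuation entry program, which is not reproduced in this report; the function names are hypothetical. The symbol sequence is copied from the seven-word example, and it is an assumption that shorter terms use the same sequence from its start. The resolve function shows only the table walk for keys of two or more words, not the full Access Routine.

```python
# Continuation symbols in the order of the seven-word example (assumed general).
CONTINUATION = ["*", "**", "%", "%%", "%%%"]


def continuation_entries(words: list[str], posting: str) -> list[tuple[str, str, str]]:
    """Return (logic code, key, posting field) rows for a term of three or more words."""
    if len(words) < 3:
        raise ValueError("continuation entries are only needed for three or more words")
    n_continuations = len(words) - 2
    if n_continuations > len(CONTINUATION):
        raise ValueError("term is longer than the seven-word example covers")
    rows = [("T", ";".join(words[: i + 2]), CONTINUATION[i])
            for i in range(n_continuations)]
    rows.append(("T", ";".join(words), posting))  # the final row carries the posting term
    return rows


def resolve(words: list[str], table: dict[str, str]):
    """Extend the key one word at a time while the posting field says 'continue'."""
    size = 2
    while size <= len(words):
        posting = table.get(";".join(words[:size]))
        if posting is None:
            return None                # no such key in the table
        if posting not in CONTINUATION:
            return posting             # reached a real posting term
        size += 1
    return None                        # ran out of words on a continuation symbol


if __name__ == "__main__":
    rows = continuation_entries(list("ABCDEFG"), "HIF")
    for row in rows:
        print(*row)
    table = {key: posting for _, key, posting in rows}
    print(resolve(list("ABCDEFG"), table))
```

A seven-word term yields six rows, five continuation rows and one final row, which matches the table in the example.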

Special Symbols. In addition to the asterisk and percent sign 
discussed under Continuation Entries, other special symbols may be used in 
the posting term field if they are helpful for a given application. For 
example, the NASA Thesaurus designates certain ambiguous or very broad 
terms as Array terms. The Thesaurus recommends use of a more specific term 
in place of the Array term. When these terms appear in the posting term 
field of the Phrase Matching file, they are followed by the @ symbol. This 
symbol alerts the indexers to the fact that the posting term is an Array 
term. For example: 

E    Analysis;00    Analysis@ 
E    Lifts;00       Lifts@ 

Coding for Input. The NLD system has an online update program used 
for adding new entries to the file. For online update, the entry is coded 
as follows: 

Logic code$Key$Posting term 

Elements in the key are separated by semicolons, and single element keys 
end with ";00". Multiple posting terms are separated by commas. 
The "$" indicates the end of a field. 

Examples of Entries Coded for Online Update: 

E$Controllability;00$Controllability 
C$Chrome;00$Chromium 
L$Geoastrophysics;00$Astrophysics,Geophysics 
T$Geological;Surveys$Geological surveys 
T$Hinged;Rotor;Blades$Hinges,Rotary wings 

When the entries are loaded into the Phrase Matching file, the "$"s that 
are used as field delimiters are dropped. The fields are entered in the 
record as follows: 

• the logic code in Column 1, 

• the key in Columns 4 through 127, and 

• the posting term in Columns 130 through 400 (variable length). 

A full description of the procedures for coding and loading new entries may 
be found in the section on DATA FILE MAINTENANCE. 
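
The step from a coded entry to a loaded record can be sketched as below. This is an illustration under stated assumptions, not the Facility's load program: the function names are hypothetical, unused columns are assumed to be blank-filled, and the logic code is assumed to be a single character as in the Phrase Matching examples.

```python
KEY_START, KEY_END = 4, 127      # key columns, 1-based and inclusive
POST_START, POST_END = 130, 400  # posting term columns (variable length)


def parse_update_entry(line: str) -> tuple[str, str, str]:
    """Split 'Logic code$Key$Posting term' into its three fields."""
    parts = line.split("$")
    if len(parts) != 3:
        raise ValueError("expected exactly three '$'-delimited fields")
    code, key, posting = (p.strip() for p in parts)
    return code, key, posting


def to_record(code: str, key: str, posting: str) -> str:
    """Lay the fields out in the columns given in the text, dropping the '$'s."""
    if len(code) != 1:
        raise ValueError("Phase One logic codes are one character")
    if len(key) > KEY_END - KEY_START + 1:
        raise ValueError("key does not fit in columns 4-127")
    if len(posting) > POST_END - POST_START + 1:
        raise ValueError("posting term does not fit in columns 130-400")
    # Column 1 holds the code; pad to column 3, then pad the key out to column 129.
    return code.ljust(KEY_START - 1) + key.ljust(POST_START - KEY_START) + posting


if __name__ == "__main__":
    record = to_record(*parse_update_entry("C$Chrome;00$Chromium"))
    print(repr(record[:20]), len(record))
```

Because the posting term is the last field and is variable in length, the sketch does not pad the record out to column 400.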

Implementation. Figure 6 presents a graphic view of creating the 
target vocabulary Phrase Matching file. It is a fairly simple process 
which involves: 

• procuring a copy of the target thesaurus, 

• constructing Phrase Matching entries from the target thesaurus 
using the procedures just described, and 

• loading the entries into the Phrase Matching file. 

Figure 6 

Phase One: Creating the Phrase Matching File for the Target Vocabulary 

The entries for NASA's original Phrase Matching file were coded manually by 
analysts, keypunched, and loaded into the file using a batch program. The 
section on HISTORY provides a detailed description of this development. 
However, based on the experience gained from building the original file, 
this process can now be automated to a large extent. If a machine-readable 
file of the thesaurus is available, a program can be written to generate 
all of the entries for the posting terms and Use references in the 
thesaurus. Analysts would still be required to construct additional Use 
references for variant forms of thesaurus terms. An online program is 
available that allows direct online data entry to replace keypunching. 

Validation. A number of programs have been written that aid in the 
validation of the Phrase Matching file. One preliminary program compiles 
an alphabetical list of terms appearing in the posting term field of the 
NLD file. These terms are referred to as the Lexical Dictionary posting 
terms. The Lexical Dictionary keys and posting terms are compared with the 
authority files for thesaurus terms and for Use references for possible 
errors. Programs exist for the following comparisons: 

Check                               Against                                       To Locate 

Lexical Dictionary keys             Thesaurus posting terms and Use references    Omissions 
Lexical Dictionary posting terms    Thesaurus posting terms                       Omissions 
Thesaurus posting terms             Lexical Dictionary posting terms              Non-matches 
Lexical Dictionary keys             Lexical Dictionary posting terms              Non-matches 

Once located, discrepancies are corrected using online maintenance 
software. 
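
Two of these comparisons can be sketched with set operations. The sketch is not one of the Facility's validation programs, and the function names are hypothetical. The direction of each check is one interpretation of the table above, and the normalization (lower case, parentheses removed, ";00" dropped) is an assumption made so that keys and thesaurus terms can be compared. Special symbols such as "@" are not handled.

```python
def normalize_term(term: str) -> str:
    """Reduce a thesaurus term to lower-case words without parentheses."""
    return " ".join(term.replace("(", " ").replace(")", " ").lower().split())


def normalize_key(key: str) -> str:
    """Reduce a key such as 'Hudson;River;NY-NJ' to the same form."""
    parts = key.split(";")
    if parts[-1] == "00":
        parts = parts[:-1]
    return " ".join(p.strip().lower() for p in parts)


def find_omissions(ld_keys, thesaurus_terms):
    """Thesaurus posting terms and Use references with no Lexical Dictionary key."""
    present = {normalize_key(k) for k in ld_keys}
    return sorted(t for t in thesaurus_terms if normalize_term(t) not in present)


def find_non_matches(ld_posting_terms, thesaurus_posting_terms):
    """Lexical Dictionary posting terms that are not valid thesaurus posting terms."""
    valid = {normalize_term(t) for t in thesaurus_posting_terms}
    return sorted(p for p in ld_posting_terms if normalize_term(p) not in valid)


if __name__ == "__main__":
    keys = ["Controllability;00", "Geological;Surveys", "Hudson;River;NY-NJ"]
    thesaurus = ["Controllability", "Geological surveys", "Hudson River (NY-NJ)", "Gold"]
    print(find_omissions(keys, thesaurus))                      # terms still to be coded
    print(find_non_matches(["Chromium", "Chromum"], ["Chromium"]))  # likely misspellings
```

An omission points to an entry that still has to be coded, while a non-match usually points to a misspelled posting term.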

Product. The product of Phase One is the target vocabulary Phrase 
Matching file and the capability for Phrase Matching input terms and 
phrases with target vocabulary terms. 

Required Programs/Tools. If the manual method of construction is 
used, the following will be required: 

1. Phrase Matching File - A VSAM file with the record structure 
described in the section labeled Record Description. 

2. Online Maintenance Software - A program that creates a load file 
from online data entry of new records, changes to existing 
records, and deletions of records. 

3. Load Program - A program that loads the load file created by the 
Online Maintenance Software into the Phrase Matching file. 

4. Access Routine - A program that accepts input words and phrases 
from an application program and returns the posting terms into 
which the input phrases translate. 

5. Continuation Entry Generation Program - A program that creates 
the continuation entries that are required for keys of three or 
more elements. 

6. Phrase Matching File Validation Programs - A set of error 
checking programs that validate the entries in the Phrase 
Matching file. 

All of the above programs are available, but may require modification to 
run on a different computer system. 

7. If the automated construction method is used, all of the above 
programs are required, and a new program must be written to 
generate the Phrase Matching entries. 

8. If the Phrase Matching file is to be used for any translation 
applications in addition to building candidate Subject Switching 
entries in Phase Two, then an application program must be written 
for each intended use. 

Manpower Estimates. If the automated approach to file construction is 
selected, it will require an estimated 10 manweeks of labor to build the 
Phrase Matching file. This represents approximately 4 weeks of programming 
effort and 6 weeks of analysis and data entry effort. 

If the manual approach is selected, less programming time will be 
required, but the analysis and data entry time will be approximately 
tripled, based on the size of the input vocabulary. 

Phase Two: Subject Switching File for Individual Input Vocabulary Terms 

Purpose. Phase Two provides a limited Subject Switching capability. 
It involves the creation of a translation for every individual posting term 
in the input vocabulary expressed in terms of the target vocabulary. The 
input and output may or may not be the same words, but they must convey the 
same concept. Phase Two is geared to handle simple individual term 
switches such as those shown below, but not the complex coordinations that 
are addressed in Phase Three. The following examples are taken from the 
DTIC/NASA Subject Switching file: 

Logic Code    Key (DTIC Posting Term)        Posting Term (NASA translation) 

E             Radar;00                       Radar 
C             Adenine;00                     Adenines 
C             Bases chemistry;00             Bases (chemical) 
C             Carbon carbon composites;00    Carbon-carbon composites 
C             Drilling machines;00           Boring machines 
L             Estimates;00                   Estimates?,Estimating? 
L             Fluorescent dyes;00            Dyes,Fluorescence 

Record Description . Each record in the Subject Switchina file 
consists .of the same three fields already described for the Phrase Matching 
file: . , 

. ° • Key . 

• Posting Term 

• Logic Code 

This record differs from the records in the Phrase Matching file in the 
following ways: 

• the logic code is recorded in the second column of the record 
rather than the first, ✓ 

• the elements of the Subject Switching key consist of posting 
terms (which may be single or multiple words) rather than 
individual words, and .. 

• the posting terms that constitute the elements of the key come 
from the thesaurus of a contributing organization. . 

The keys for all entries created in Phase Two consist of a single element
followed by a semicolon and two zeroes. As stated above, in Phase Two
these elements are the single and multi-word posting terms that make up the
input vocabulary. Each key is unique because the contributing
organization's posting terms are each unique.

The posting term field represents the target vocabulary posting term or
terms that express the concept equivalent to the input vocabulary posting
term in the key.
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The logic code provides a weak syntax for use by the Access Routine in its 
processing and indicates the relationship between the key and the posting 
term.

For each entry in the Subject Switching file, it is necessary to determine 
the logic code, the key, and the posting term or terms. 

Logic Code. Phase Two logic codes E, C, and L are determined in
essentially the same way as in Phase One. However, in Phase Two the logic
code will be entered in the second column of the record.

Logic Code ƀE, or blank E, indicates that each organization has identically
spelled terms with identical meanings as used in the context of each
environment, and therefore the key and the posting term are exact matches.
For example:

ƀE  Europe;00                      Europe

ƀE  Aircraft carriers;00           Aircraft carriers

Logic Code ƀC indicates that the posting term in the target vocabulary shows
some change from the posting term in the input vocabulary. The input term
may be singular, while the target term is plural. For example:

ƀC  Adenine;00                     Adenines

The input term may have a different form of a word. For example:

ƀC  Bases chemistry;00             Bases (chemical)

One term may have a hyphen which the other omits. For example:

ƀC  Carbon carbon composites;00    Carbon-carbon composites

The target term may be different from the input term, but it means
essentially the same thing. For example:

ƀC  Drilling machines;00           Boring machines

In each case, there is a change in the term but not in the concept or
subject described by the term.

Logic Code ƀL indicates that a list of multiple posting terms from the
target vocabulary is necessary to convey the same meaning as the term from
the input vocabulary. For example:

ƀL  Femoral arteries;00            Arteries, Femur

Each of the above logic codes is used for single term entries only. That
is, the key contains only one element which, in Phase Two, is a posting term
from the input vocabulary, followed by a semicolon and two zeroes.

Two new codes are used in Phase Two in the Subject Switching file.
Logic Code ƀI indicates that the proper translation is context dependent
and therefore indeterminate and must be an indexer choice. An
indeterminate translation is flagged with a question mark. For example:

ƀI  Estimates;00                   Estimates?, Estimating?

The input vocabulary has only the term "estimates" to cover both of the
concepts of "estimates" and "estimating" that are found in the target
vocabulary. The correct translation must be selected by the indexer based
on the document at hand.

In another case, the terms appear to be the same but have a slight
difference in meaning. For example:

ƀI  Performance tests;00           Performance tests?

The target vocabulary's thesaurus limits the use of "performance tests" to
apply only to operating equipment. The organization contributing the input
vocabulary uses "performance tests" for equipment, systems, or human
performance. Therefore, the terms may or may not be equivalent depending
upon the context. The indexer will have to choose.

Logic Code ƀ0 is the only numeric logic code used. Whenever a translation
of a term from the input vocabulary is not wanted or when the target
vocabulary does not have an acceptable translation, the logic code used is
zero (0). For example:

ƀ0  Peer groups;00                 00
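
The five Phase Two logic codes can be illustrated with a small lookup
sketch. This is not the NLD Access Routine; the dictionary RECORDS, the
helper make_key, and the function switch_term are hypothetical names
introduced here, and the handful of records simply repeats the examples
given above.

```python
# Hypothetical in-memory stand-in for a few Subject Switching records.
# Each key is one input posting term followed by ";00".
LOGIC_CODES = {
    "E": "exact match",
    "C": "changed form, same concept",
    "L": "list of target terms",
    "I": "indeterminate; indexer choice",
    "0": "no translation wanted or available",
}

RECORDS = {
    "Radar;00": ("E", ["Radar"]),
    "Adenine;00": ("C", ["Adenines"]),
    "Fluorescent dyes;00": ("L", ["Dyes", "Fluorescence"]),
    "Estimates;00": ("I", ["Estimates?", "Estimating?"]),
    "Peer groups;00": ("0", ["00"]),
}


def make_key(term):
    """Build a single-element key: the term, a semicolon, two zeroes."""
    return term + ";00"


def switch_term(term):
    """Return (logic code, target terms), or None when no record exists."""
    record = RECORDS.get(make_key(term))
    if record is None:
        return None
    code, targets = record
    if code == "0":
        return code, []  # "zeroed out": nothing is posted
    return code, list(targets)
```

A term with no record returns None, which corresponds to the unmatched
terms that analysts must research.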

Key. As stated, in Phase Two, the elements of the key are terms from
the input vocabulary, not the words of a term as in Phase One. The key
contains only one posting term, and two zeros are added as a place holder
for the second element. An entry is created for every individual posting
term in the input vocabulary.

Posting Term. In the Subject Switching file, the posting term is
selected by analysts familiar with both the input and the target
vocabularies. The posting term field contains one or more posting terms
from the target vocabulary or the codes 00 or NIS. The contents of the
posting term field reflect the best translation that can be made of the
concept expressed by the individual term from the input vocabulary which is
in the key. Sometimes there will be an exact match between an input
vocabulary posting term and a target vocabulary posting term. In some
instances, the translation will reflect only the addition or subtraction of
an "s" or a hyphen. In other cases, the term may change to a different
term or to a list of terms. A translation may not be possible or not be
wanted, and the term is "zeroed out": the logic code is entered as zero and
the posting term as two zeroes. A term considered Not In Scope is posted
to "NIS".

Symbols. In Subject Switching, three new symbols are introduced into
the posting term field in addition to the "@" Array symbol described under
"Special Symbols" in Phase One. When one of these symbols is used, it
immediately follows the term to which it applies.
• Indexer Choice (?)

The question mark, discussed under logic code I, is used when the
proper translation is context dependent and therefore
indeterminate. The indexer is presented with a choice of terms,
each flagged with a question mark.

• Broader Term Translation (>)

When the suggested target term is of a broader generic level than
the input term, the Lexical Dictionary posting term is followed
by a "greater than" (>) symbol. For example:

ƀC  Jugular vein;00                Veins>

• Additional Target Vocabulary Narrower Terms (+)

When the suggested target posting term has narrower terms which
are not covered by the vocabulary of the contributing
organization, a plus sign (+) immediately follows the target
posting term. For example:

ƀE  Bolts;00                       Bolts+

(The input vocabulary has no narrower terms to "Bolts", but
the target vocabulary has narrower terms "Rock bolts" and
"Tie bolts".)

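Because each symbol immediately follows the term to which it applies, a
program reading the posting term field can separate the two with a
trailing-character check. The following is a minimal sketch only; the
function name split_symbol and the SYMBOLS table are assumptions made for
illustration and are not part of the NLD software.

```python
# Symbols that may trail a posting term in the posting term field.
SYMBOLS = {
    "?": "indexer choice",
    ">": "broader term translation",
    "+": "target vocabulary has additional narrower terms",
    "@": "array symbol (Phase One)",
}


def split_symbol(posting_term):
    """Split 'Veins>' into ('Veins', '>'); the symbol is None if absent."""
    term = posting_term.strip()
    if term and term[-1] in SYMBOLS:
        return term[:-1].rstrip(), term[-1]
    return term, None
```

A term such as "Bases (chemical)" is returned unchanged, because a closing
parenthesis is not one of the symbols.
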
Implementation. Figure 7 presents an overview of Phase Two
implementation. A machine-readable file of the posting terms of the input
vocabulary is required. This file is processed through the NLD system
using the target-vocabulary Phrase Matching file constructed in Phase One
and Phrase Matching logic. For each input posting term either an exact
match, a partial match, or no match is found. By computer program, base
files are created that contain candidate entries for the exact matches and
the partial matches. These files are printed, reviewed by analysts, then
edited online. When editing is complete, they are loaded into the Subject
Switching file. The no-match group is printed and researched by analysts.
These no matches are translated into target vocabulary equivalents, if
possible, or are "zeroed out", that is, translated to a posting term of 00.
In a few instances, new terms may be added to the target vocabulary to
translate these terms. A no-match file is then created using the
online-update program. When all entries are edited, they are loaded onto
the master Subject Switching file.
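
The sorting of input posting terms into exact, partial, and no-match groups
can be sketched as follows. This is an illustration under stated
assumptions: build_base_files, toy_match, and the TOY_MATCHES table are
hypothetical, and the toy lookup merely stands in for the Phrase Matching
logic of Phase One, which is considerably richer.

```python
# Hypothetical stand-in for Phrase Matching results; not the real NLD logic.
TOY_MATCHES = {
    "Radar": ("exact", ["Radar"]),
    "Adenine": ("partial", ["Adenines"]),
}


def toy_match(term):
    """Return (status, candidate target terms) for one input posting term."""
    return TOY_MATCHES.get(term, ("none", []))


def build_base_files(input_terms, match=toy_match):
    """Sort input posting terms into exact, partial, and no-match groups."""
    files = {"exact": [], "partial": [], "none": []}
    for term in input_terms:
        status, targets = match(term)
        if status == "none":
            files["none"].append(term)  # printed for analyst research
        else:
            # Candidate entry: single-element key plus suggested terms.
            files[status].append((term + ";00", targets))
    return files
```

The logic code is deliberately left out of the candidate entries, since the
text assigns that decision to the analysts who review the printed files.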

Validation. Programs exist for the following comparisons in the
completed Subject Switching file:

Check                              Against                          To locate

Input vocabulary posting terms     Keys                             Omissions
Keys                               Input vocabulary posting terms   Non-matches
Lexical Dictionary posting terms   Target thesaurus posting terms   Non-matches







Figure 7 

Phase Two: Creating the Subject Switching File for Individual Input
Posting Terms
When discrepancies are found, corrections are made using online-maintenance
software.
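
The three validation comparisons amount to set differences. The sketch
below assumes the Subject Switching file has been read into a dictionary
mapping each key to its list of posting terms; the function name validate
and the report field names are illustrative and do not describe the actual
validation programs.

```python
SYMBOLS = "?>+@"        # trailing symbols allowed on posting terms
CODES = {"00", "NIS"}   # non-term codes allowed in the posting term field


def validate(records, input_terms, target_terms):
    """records maps a key such as 'Radar;00' to its list of posting terms."""
    input_terms = set(input_terms)
    key_elements = set()
    for key in records:
        # "00" is only a place holder for a missing second element.
        key_elements.update(e for e in key.split(";") if e != "00")
    postings = {t.rstrip(SYMBOLS) for terms in records.values() for t in terms}
    return {
        # input posting terms with no key
        "omissions": sorted(input_terms - key_elements),
        # key elements that are not valid input posting terms
        "invalid_key_elements": sorted(key_elements - input_terms),
        # posting terms that are not valid target posting terms
        "invalid_posting_terms": sorted(postings - set(target_terms) - CODES),
    }
```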

Products. The product of Phase Two is a partial Subject Switching file
and the capability for Subject Switching from individual input posting
terms to target-vocabulary posting terms.

Required Programs/Tools . Phase Two development requires four programs 
described in Phase One: 

1. Phrase Matching File - now completed 

2. Online Maintenance Software 

3. Load Program 

4. Access Routine. 

In addition, Phase Two development requires: 

5. Subject Switching Build Program - A set of programs which process
a machine-readable file of the input posting terms through the
Phrase Matching file and create:

• a file of candidate entries for exact matches,
• a file of candidate entries for partial matches, and
• a printout of no matches.

6. Software for editing the files of candidate entries - A software
package with text editing capabilities such as TSO, SPF, or
WYLBUR is helpful.

7. Subject Switching File Validation Programs - Error checking
routines which validate that there is a key to match every input
posting term, that all elements of the key are valid input
posting terms, and that all entries in the posting term field are
valid target posting terms.

8. An application program for each Subject Switching application, if
not already developed in Phase One.

Manpower Estimates. Coding for Phase Two will require approximately
two manweeks per 1000 terms in the input vocabulary. If the additional
programs required for Phase Two must be modified to run on a different
system, some programming time will be required. In addition, programming
is required to develop software for the specific applications for which the
Subject Switching capability is being developed.
Phase Three: Subject Switching File for Coordinated Input Vocabulary Terms



Purpose. Phase Three concentrates on translating concepts expressed
by coordination of multiple input vocabulary posting terms into target
vocabulary posting terms. Completion of Phase Three provides full Subject
Switching capability.

Record Description. Phase Three is an expansion of the Subject
Switching file created in Phase Two; therefore, the record is the same as
that described in Phase Two. The record consists of the same three fields:
the logic code, the key, and the posting term. The logic code is recorded
in the second column, and the elements of the key are posting terms from
the input vocabulary. The records created in Phase Three differ from those
in Phase Two in that the key will always contain at least two elements and
that the logic code is always T. The posting term field contains the
target vocabulary posting term or terms which express the concept
equivalent to the coordination of input posting terms in the key. For
example:

ƀT  Accident investigations;Aircraft    Aircraft accident investigation

For each entry in the Subject Switching file, it is necessary to determine
the logic code, the key, and the posting term. Appendix B contains the
procedures followed for creating DTIC/NASA Subject Switching entries for
DTIC term coordinations, which can be used as a guide.

Logic Code. The logic code is always ƀT.

Key. Determining the key is a decision-making process performed by an
analyst. It is based upon a study of the vocabulary and the indexing
practices and policies of the contributing organization. The key always
contains at least two input posting terms which, when taken together
(coordinated), convey the same concept as the target vocabulary posting term
or terms in the posting term field for that entry. Continuation entries,
discussed under Phase One, are required for entries with three or more
elements in the key.

Posting Term. The posting term field may contain one or more target
vocabulary posting terms. The concept expressed by the posting term field
should be the same as that expressed by the key.

Implementation. Figure 8 presents an overview of one implementation
option for Phase Three. This option consists of generating candidate term
coordination entries by processing the target vocabulary through the input
vocabulary Phrase Matching file. All target vocabulary terms which
translate into two or more input vocabulary terms are selected as candidate
entries. The program formats these entries according to the rules for the
Subject Switching file. The input vocabulary terms (which were the output
of the Phrase Matching file) become the keys of the Subject Switching
entry. The target vocabulary posting term (which was the input to the
Phrase Matching file) becomes the Subject Switching posting term.
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Figure 8 

Phase Three: Creating the Subject Switching File for Coordination of
Input Posting Terms

[Flowchart not reproduced. One panel is labeled: To construct the input
Phrase Matching file if one does not exist.]
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For example: 

Target Vocabulary Posting Term:   Abrasion resistance

Input Vocabulary Translation:     Abrasion, Resistance
(from Phrase Matching file)

Creates Subject Switching Entry:

ƀT  Abrasion;Resistance            Abrasion resistance

Analysts review and edit the candidate entries generated by the
program. If the contributing organization has a lexical dictionary, it can
be used to create the candidate entries for Phase Three. (DTIC and the
NASA STI Facility have lexical dictionaries.) If no Phrase Matching file
exists for the input vocabulary, one can be created using the procedures
described in Phase One.
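
The automated option reverses the direction of the Phrase Matching run: the
target term goes in, and the input vocabulary terms that come out become
the key. A minimal sketch follows, assuming the Phrase Matching output is
already available as a dictionary; candidate_entries is a hypothetical name
and not the Table Entry Build Program itself.

```python
def candidate_entries(phrase_matches):
    """Build candidate coordination entries for analyst review.

    phrase_matches maps each target vocabulary posting term to the list
    of input vocabulary terms produced for it by the input Phrase
    Matching file.
    """
    entries = []
    for target, input_terms in phrase_matches.items():
        # Only translations into two or more input terms are coordinations.
        if len(input_terms) >= 2:
            entries.append(("T", ";".join(input_terms), target))
    return entries
```

Each tuple holds the logic code, the key, and the posting term, in that
order; single-term translations are skipped because Phase Two already
covers them.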

The table entries can also be created by feedback from indexers who 
spot combinations while indexing. 

Another possibility is making a study of documents which have been
indexed independently using the input vocabulary and the target vocabulary.
By comparing the lists of posting terms assigned by the two vocabularies,
coordinations should become apparent to a trained analyst.

Any one of these options, or some combination of them, may be used to
create table entries. When all entries are in a file, reviewed, and
edited, they are loaded onto the master Subject Switching file for the
contributing organization.

Validation. The same validation routines and correction procedures
used for Phase Two files may be used for Phase Three.

Products. With the addition of the coordinated DTIC terms to the
Subject Switching file, the NLD System achieved the capability for full
Subject Switching from DTIC indexing to NASA indexing. The product as the
indexer sees it is a printout with two lists of terms, one from DTIC's
fields 23 and 25 and the other of the NASA terms to which DTIC's terms have
been translated. See Figure 9.

Required Programs/Tools. Phase Three requires the following
components already described in Phases One and Two:

1. Online Maintenance Software 

2. Load Program 

3. Access Routine 

4. Subject Switching File Validation Programs

5. Continuation Entry Generation Program 

6. Application Program 

In addition, if the automated approach to creating and editing candidate 
entries is selected, the following programs will be required: 

7. Program to generate the input vocabulary Phrase Matching file (if
not created for Phase One).
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Figure 9 

DTIC/NASA Subject Switching Output

[Sample printout not reproduced. It shows a DTIC citation and abstract
followed by three lists: the Field 23 DTIC terms, the Field 25 DTIC terms,
and the NASA posting terms to which they were translated.]
8. Table Entry Build Program - A program which takes the output from
running the target thesaurus through the input Phrase Matching
file and creates a file of candidate entries from the partial
matches.

9. Software for editing the file of candidate entries.

Manpower Estimates. The level of effort required for Phase Three will
depend upon the implementation option selected. It is estimated that the
automated approach of creating and editing candidate entries will require
approximately 2 manweeks per 1000 automatically generated table entries.
In addition, programming effort may be required for the programs numbered 7
and 8 listed above. No time estimates are available for the other options.
Phase Four: User Feedback and Maintenance

Purpose. The purpose of Phase Four is the establishment of procedures
for handling updates to the Lexical Dictionary data files based on updates
to the input and target vocabularies and feedback from the indexers.
Figure 10 presents an overview of Phase Four activities.

Updates to the Input Vocabulary. Whenever a contributing organization
adds terms to or changes terms in its controlled vocabulary, changes must
be made in the Subject Switching file. At the very least, one entry must
be made for each new individual term, together with its logic code and
translation into the target vocabulary.

In addition, the input vocabulary should be studied for possible
additional tables or improved tables which should be entered. Procedures
for making changes are covered in the section on DATA FILE MAINTENANCE.

It is desirable to make arrangements with the contributing
organization for automatic receipt of information on thesaurus changes.
Without continuing communication between the organizations and the
necessary information for updates, there will be no way to distinguish
between new terms which should be added to the Subject Switching file and
errors.

Updates to the Target Vocabulary. Whenever terms are added or changed
in the target vocabulary, this must be reflected in the Phrase Matching
file. An entry will be made for each new posting term and Use reference,
and also for variant forms and synonyms. Complete procedures for adding
entries are covered in the section on DATA FILE MAINTENANCE.

The Subject Switching file will also be affected by the addition of
new target vocabulary terms. Analysts must look for possible additional
tables, improved translations, or new translations of terms previously
zeroed out.

Updates should be made on a regular basis. 

User Feedback. User feedback, such as from indexers, is an important
part of the intellectual effort in Subject Switching. With specific
documents in hand, indexers are uniquely able to verify whether suggested
translations are appropriate. Indexers can spot new coordinations which
should be added to the Subject Switching file, or coordinations which
should be modified or deleted.

It is anticipated that indexer feedback will suggest:

• Modifications to translations based on operational experience,

• Changes of translations based on new terms in either vocabulary,

• Additions of table entries, and

• Deletions of table entries.

Feedback must be written and two-way between the indexers and the
Lexical Dictionary team. An orientation meeting prior to the implementation
of full Subject Switching is essential for initiating the feedback process.
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Figure 10 

Phase Four: User Feedback and Maintenance




The NASA STI Facility has designed a form to streamline the feedback
process and to encourage the inclusion of all needed information. See
Figure 11.

Subject Switching Error List. In any application involving Subject
Switching capability, it is useful to produce an error list of all input
terms and partial coordinations which could not be matched in the Subject
Switching file. Analysts should review these to determine if new entries
should be added to the file.

Validation. To check for accuracy, the validation programs described
for Phases One, Two, and Three are run periodically. Any errors which are
detected by these validation programs are corrected using the online
maintenance software.

Required Programs/Tools. Phase Four requires:

1. An operational NLD System with Access Routine, data files, and
application programs.

2. Online maintenance software.

3. Phrase Matching file and Subject Switching file validation
programs.

Manpower Estimates . It is estimated that each lexical dictionary data 
file will require approximately 5 manweeks of maintenance each year after 
an initial "shake-down" period. 
NASA LEXICAL DICTIONARY FEEDBACK                    Date(s)

                                                    Analyst

                                                    Recommended
Report No.    Input Term    NLD Translation         NLD Translation

Comments

Figure 11
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DATA FILE MAINTENANCE 



Sources of Change to Data Files

Four sources of change to the NLD data files have been identified.

• Changes in the NASA Thesaurus. Because the Phrase Matching file
contains an entry for every NASA posting term and thesaurus Use
reference, this file must be updated every time the NASA
Thesaurus is changed. New NASA terms may replace old
translations in the Subject Switching file or otherwise change
translations already recorded.

• Changes in the Input (DTIC) Thesaurus. The Subject Switching
file contains an entry for every posting term in the input
vocabulary; therefore, every new input term must be translated
and this translation added to the file.

• Changes Recommended by Indexer Feedback. These may be for either
data file and are entered following approval by the
abstracting/indexing supervisor and the NLD project director:

- Phrase Matching file Use references.

There is an ongoing effort to increase the number of Use
references constructed specifically for the NLD. These
consist of synonyms, variant spellings, and different word
forms of NASA posting terms. The match capability of this
file, designed for general purpose phrase matching,
increases with the number of Use references in the file.

- DTIC/NASA Subject Switching file.

Indexers provide recommendations for improved translations
based on actual documents in hand. Most of these
suggestions initiate changes in the Subject Switching file.

• Changes Derived from Lists. Lists of input terms that find
either no match or only a partial match in the NLD are printed
out each time that a DTIC tape is run through the NLD Access
Routine. These lists are called exception listings. See Figure
12. The first column on the printout shows how many times the
term or combination of terms was encountered on this tape. The
second column gives the DTIC accession number of the first
occurrence. The third column indicates the DTIC field from which
the DTIC posting term came: Field 23 for descriptors, DTIC's
controlled vocabulary; Field 25 (unmarked) for DTIC's
identifiers or open-ended indexing. The final column shows the
partial matches (combinations of terms that are part of a longer
coordinated entry) or unmatched DTIC terms. The ones in this
example did not translate because of an input error. One added
and the other omitted an "s". New terms would appear here, too,
if they had not been added to the NLD.
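
The four columns of an exception listing can be produced by a simple tally.
The sketch below is an assumption about how such a listing might be built,
not a description of the Access Routine; the function name
exception_listing and the shape of its input are illustrative.

```python
def exception_listing(unmatched):
    """Summarize unmatched terms in the four-column layout described above.

    unmatched is an iterable of (accession_number, field, terms) tuples in
    tape order, where terms is an unmatched term or partial coordination.
    """
    rows = {}
    for accession, field, terms in unmatched:
        # Remember the first accession number seen; count every occurrence.
        entry = rows.setdefault((field, terms), [0, accession])
        entry[0] += 1
    return [
        (count, first, field, terms)
        for (field, terms), (count, first) in sorted(rows.items())
    ]
```

Rows come out grouped by field and then alphabetically by term, which is
the order an analyst would scan when looking for candidate new entries.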



04/26/84    DTIC TERMS NOT FOUND IN NASA LEXICAL DICTIONARY

 1  C033784  23  ALTITUDE;GUIDED MISSILES
 2  B080915  23  ALUMINUM;COMPOSITE MATERIALS
    A139253  23  BRIDGES;CIRCUITS
    B080692  23  CIRCUITS;CONTROL
    B080766  23  COMMUNICATIONS NETWORKS;GLOBAL COMMUNICATIONS
    B080729  23  COMPOSITE MATERIALS;MATRIX MATERIALS
    A139438  23  DATA PROCESSING;DATA STORAGE SYSTEMS
    C033792  23  DETECTION;HIGH ALTITUDE
    A139261  23  ESTIMATES;ORBITS
    B080888  23  FIBER REINFORCEMENT;GLASS FIBERS
    B080682  23  FLIGHT;SPACE FLIGHT
    A139485  23  HAZARDS;SAFETY
    A139476  23  HIGH RATE;INTENSITY
    B080693  23  LIMITATIONS;POWER
    B080799  23  MEASUREMENT;PARTICLES
    A139271  23  SLOPE
    B080719  23  TEST METHODS;THERMAL PROPERTIES
    A139216  23  VAPOR
    A139227      A EXCITONS
    B080621      A/A37U-15 TOWING REELS
    B080823      ABCS AIRBORNE BEAM CONTROL SYSTEM
    A139155      ACB AIR CUSHION BARGES
    A139337      ACES AIRDROP CONTROLLED EXIT SYSTEM
    B080779      ACOUSTIC HOLOGRAPHY
    A139482      ACOUSTIC IMAGES
    B080916      ACOUSTOOPTIC CELLS
    C033856      ACTIVE MASS INJECTION
    B030720      ADAPTIVE ANTENNAS

Figure 12
Exception Listing
Some of the Field 25 terms may be the same as authorized NASA terms
except for an acronym preceding the DTIC term or for some variations in 
spelling. Any such DTIC terms now initiate new entries into the Phrase 
Matching file. The exception listing also may suggest new coordinations of 
DTIC terms that could be translated to a NASA term. 

To summarize, the files changed by various sources of input are as
follows:

Input Material            Phrase Matching    Subject Switching

NASA Thesaurus update            X                   X

Input thesaurus update                               X

Indexer feedback                 X                   X

Exception listings               X                   X

The NLD maintenance procedures triggered by each input source will be
described in the following sections.

Record Coding 

In each instance, the correct logic code, key, and posting term(s) for
the record will be determined and written out for data entry. For online
update, the entry is coded as follows:

Logic code$Key$Posting term

When needed, the posting term will be followed by a symbol, as
previously described. Elements in the keys of the Phrase Matching file are
words; in the Subject Switching file, they are terms from the vocabulary of
the contributing organization.

Elements in the key are separated by semicolons, and single element
keys are followed by ";00". Multiple posting terms are separated by
commas. The "$" separates the fields.

Here are some examples of entries coded for online updating:

For Phrase Matching

E$Analyzing;00$Analyzing@
E$Controllability;00$Controllability
L$Geoastrophysics;00$Astrophysics, Geophysics
T$Geological;Survey$Geological surveys
T$Hinged;Rotor;Blades$Hinges, Rotary wings

For Subject Switching

E$Acids;00$Acids

C$Amino plastics;00$Thermosetting Resins>
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For Subject Switching (continued) 



L$Animal diseases;00$Diseases, Veterinary medicine 
T$Blood Circulation;Brain$Brain circulation 
T$Blood Circulation;00$Blood circulation^- 
I$Abiotic processes;00$Abiogenesis?
0$Acne;00$NIS

0$Aerial pickup system;00$00
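
The coding rules above ("$" between fields, semicolons between key
elements, commas between posting terms) are enough to parse a coded entry.
The following is a minimal sketch; parse_entry is a hypothetical helper and
not part of the online maintenance software.

```python
def parse_entry(line):
    """Split 'Logic code$Key$Posting term' into its three parts.

    Returns (logic code, list of key elements, list of posting terms).
    The ';00' place holder of a single-element key is dropped.
    """
    logic_code, key, posting = line.split("$", 2)
    elements = [e.strip() for e in key.split(";")]
    if len(elements) == 2 and elements[1] == "00":
        elements = elements[:1]
    terms = [t.strip() for t in posting.split(",")]
    return logic_code.strip(), elements, terms
```

Symbols such as "?" or "@" stay attached to their posting terms, so a later
step can interpret them.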

When the entries are loaded into the file, the "$"s which are used as
field delimiters are dropped, and the online maintenance software
automatically places the logic code in the correct column. The fields are
entered in the records as follows:

• the logic code in column 1 for the Phrase Matching file and in
column 2 for the Subject Switching file,

• the key in columns 4 through 127, and

• the posting term in columns 130 through 400 (variable length).
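
The column layout can be sketched as a small formatting function. The name
layout_record and its parameters are assumptions made for illustration; the
sketch shows only where each field lands and does not reproduce the actual
load program.

```python
KEY_START, KEY_END = 4, 127            # 1-based, inclusive
POSTING_START, POSTING_END = 130, 400  # 1-based, inclusive


def layout_record(logic_code, key, posting, subject_switching=True):
    """Place the three fields in the columns listed above."""
    if len(logic_code) != 1:
        raise ValueError("logic code must be one character")
    if len(key) > KEY_END - KEY_START + 1:
        raise ValueError("key too long")
    if len(posting) > POSTING_END - POSTING_START + 1:
        raise ValueError("posting term field too long")
    columns = [" "] * (POSTING_START - 1)  # columns 1 through 129
    # Column 2 for Subject Switching, column 1 for Phrase Matching.
    columns[1 if subject_switching else 0] = logic_code
    columns[KEY_START - 1:KEY_START - 1 + len(key)] = list(key)
    return "".join(columns) + posting  # posting term is variable length
```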



Maintenance Functions 

The functions or capabilities provided by the NLD's online maintenance
system are executed through a series of commands. These allow maintenance
personnel to process input from any of the maintenance sources described
above. A separate set of commands is provided for each of the NLD data
files. The chart below indicates the capabilities or functions provided by
the maintenance system, along with the command used to carry out each
function for each of the NLD data files.



Maintenance Functions           Maintenance System Commands

                                Phrase Matching    DTIC/NASA Subject
                                File               Switching File

Creating Authority Files        VALSETUP           DTICVSAM
Data File Validation            NASAVAL            DTICVAL
Entering Update Transactions    NASAUPDT           DTICUPDT
Loading Transaction Files       NASALOAD           DTICLOAD
Printing Maintenance Tool       NASAPRNT           DTICPRNT
Printing Maintenance Tool       NASANVRT           DTICNVRT
Creating Backup Tapes           NASABKUP           DTICBKUP



The commands listed above, in addition to several miscellaneous maintenance
commands, are explained in more detail in the section on "Maintenance
Commands".

Additions of New Records . To add a record to any file, use the 
appropriate update command as indicated in the table of online maintenance 
commands above. The form of the entry is: 
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Logic Code$Key$Posting term either with or without a symbol. 

Deletion of Existing Records. To delete a record from any file, use
the appropriate update command as indicated in the table of online
maintenance commands above. Enter DEL, a dollar sign, and the key of the
unwanted record. For example:

Existing record:
T$DISTRIBUTION;PARAMETERS$DISTRIBUTED PARAMETER SYSTEMS proves to be a
poor choice of coordinated terms for translation. To prevent the
coordination of these terms in future translations, the record must be
deleted.

Enter: DEL$DISTRIBUTION;PARAMETERS

Changes to an Existing Record. To change the key, use the appropriate
update command as indicated in the table of online maintenance commands
above. Delete the existing record and add the record in its correct form.

Existing record: E$ERUOPE;00$EUROPE must be deleted as the key is
misspelled.

Enter: DEL$ERUOPE;00 to erase the error and

Enter: E$EUROPE;00$EUROPE to add the correct record.

Changes to Logic Code Field of a Record. To change a logic code, use
the appropriate update command as indicated in the table of online
maintenance commands above. Re-enter the record in its correct form.

For example:

Existing record: E$PHOTOGRAPHIC;EMULSIONS$PHOTOGRAPHIC EMULSIONS
should have a logic code of T.

Enter: T$PHOTOGRAPHIC;EMULSIONS$PHOTOGRAPHIC EMULSIONS

Any logic code entered will replace any previously entered logic code for
that same key.

Changes to Posting Term Field of a Record. To change the posting term
field in any way, use the appropriate update command as indicated in the
table of online maintenance commands above. Re-enter the record in its
correct form.

Existing record: E$MEDICINE;00$MEDICINE should have an array term
symbol following the posting term.

Enter: E$MEDICINE;00$MEDICINE@

Any posting term(s) entered will replace any previously entered posting
term(s) for that same key.
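
The add, delete, and change rules above reduce to two operations on a keyed file: a DEL transaction removes the record for a key, and any other transaction stores its logic code and posting term under the key, replacing whatever was there. The following is a minimal Python sketch of that behavior, not the Facility's program. It assumes an in-memory dictionary standing in for the VSAM master file; the function name apply_transaction is hypothetical, and the edit checks performed by the real commands are left out.

```python
def apply_transaction(master, transaction):
    """Apply one update transaction to `master`, a dict keyed by NLD key.

    Illustrative only: a dict stands in for the VSAM master file, and the
    edit checks performed by the real update and load commands are omitted.
    """
    fields = transaction.split("$")
    if fields[0] == "DEL":
        # DEL$key removes the record; a key that is not on file is ignored here.
        master.pop(fields[1], None)
        return
    logic_code, key, posting_term = fields
    # Re-entering a key replaces its logic code and posting term(s).
    master[key] = (logic_code, posting_term)


master = {}
for line in [
    "E$ERUOPE;00$EUROPE",        # record entered with a misspelled key
    "DEL$ERUOPE;00",             # erase the error
    "E$EUROPE;00$EUROPE",        # add the correct record
    "E$MEDICINE;00$MEDICINE",
    "E$MEDICINE;00$MEDICINE@",   # re-entry replaces the posting term
]:
    apply_transaction(master, line)
```

After the five transactions, the dictionary holds one record for EUROPE;00 and one for MEDICINE;00, the latter carrying the array term symbol.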
Symbols

Symbols should be used as needed. These have been described at some
length in the subsections on Symbols under Phase One and Phase Two.

Logging On

Additions, deletions, or changes to any NLD record are done online.
One user ID has been designated for NLD file maintenance; a second one is
available for data entry only. Follow the log on procedure for whatever
data base management system is used, and when the system prompts that it is
ready, type in the desired command. NASAUPDT is used to correct the Phrase
Matching file, and DTICUPDT is used to maintain the DTIC/NASA Subject
Switching file. The use of either of these commands creates a dataset of
entries which will be used to update the master NLD file. Errors in this
dataset are also corrected online.



Maintenance Commands 

The NLD maintenance system provides a series of commands that are used to
accomplish file maintenance activities. For each type of online activity,
there are normally parallel commands for each data file. The corresponding
command for the NASA Phrase Matching file usually begins with the letters
"NASA". The command for the DTIC/NASA Subject Switching file begins with
the letters "DTIC".



                                        Commands

Functions                               Phrase Matching    DTIC/NASA Subject
                                        File               Switching File

Provides NLD translations online        DTICACC            DTICACC
Creates backup tapes                    NASABKUP           DTICBKUP
Creates continuation entries            NASACONT           DTICCONT
Displays file entries online            NASAFIND           DTICFIND
Loads transaction files                 NASALOAD           DTICLOAD
Prints file, alpha by postings          NASANVRT           DTICNVRT
Prints file, alpha by key               NASAPRNT           DTICPRNT
Counts entries, sorted by logic code    NASATOT            DTICTOT
Unloads file for large-scale editing    NASAUNLD           DTICUNLD
Enters update transactions              NASAUPDT           DTICUPDT
Validates file entries                  NASAVAL            DTICVAL
Creates authority files                 VALSETUP           DTICVSAM
Displays records online                 PRINT IDS          PRINT IDS



These commands are described more fully in the pages that follow. 

DTICACC   This command processes an input word or phrase through the Access
          Routine. It provides, on the terminal screen, the full or
          partial translation of the input material, if any translation
          into NASA terms is available through the NLD. Otherwise, the
          program returns the message:

          UNABLE TO IDENTIFY

          The command can be used to see how the NLD will translate phrases
          or groups of terms that do not appear on a tape.



DTICBKUP  These commands create backup tapes for the VSAM master files
NASABKUP  'NLD.SSDTIC.MASTER' and 'NLD.NASA.MASTER', respectively.

          A backup is run after every file update so that the most current
          backup tape always reflects the current status of the VSAM file.
          Three successive backup tapes are retained in the tape library
          for each file. When a new backup tape is created, it replaces
          the oldest existing backup. An entry is recorded in the File
          Backup Log (shown in Figure 13) for each successful run of a
          backup command. The job printouts for the last three backup jobs
          are also kept for reference.

DTICCONT  These commands initiate jobs that read every entry in the data
NASACONT  file, generate all required continuation entries, and add them to
          the file. When a new continuation entry has a key identical to
          that of an existing posted entry, the job adds ;00 to the end of
          the key of the posted entry. DTICCONT and NASACONT are used only
          when an update is so large that coding and entering continuation
          entries individually is too time consuming to be economically
          feasible. The commands are executed after the update and at the
          end of the work day so that the programs can be run overnight.

DTICFIND  These commands search the data files for a specified key, and
NASAFIND  print at the terminal ten sequential Lexical Dictionary records,
          beginning with the key requested, if it exists. If the requested
          key is not found, the program will locate the sequential position
          in which the key should occur and print the next ten records.

DTICLOAD  These commands load additions and corrections from the dataset
NASALOAD  created by the UPDT command, that is, LEX.DTIC.MOD or
          LEX.NASA.MOD, into the appropriate master file in order to update
          it. For DTIC the master file is 'NLD.SSDTIC.MASTER' and for NASA
          it is 'NLD.NASA.MASTER'. The Load command performs a number of
          edit checks on the transactions. Transactions passing the edit
          checks (good transactions) are loaded into the master file and
          are deleted from the work dataset. Transactions rejected by the
          edit checks are not loaded, but are rewritten to the appropriate
          LEX.____.MOD dataset for correction. Rejected transactions are
          listed on the printout with a notation of the error which caused
          the entry's rejection. The person doing the NLD maintenance
          corrects rejected transactions in the LEX.____.MOD dataset and
          then re-executes the Load command.



Figure 13
FILE BACKUP LOG

NLD.NASA.MASTER BACKUPS

Date        Time        Job#        Initials

NLD.SSDTIC.MASTER BACKUPS

Date        Time        Job#        Initials


DTICNVRT  These commands print the master files sorted alphabetically by
NASANVRT  posting terms. In order to readily locate a particular posting
          term in the NLD, it is necessary to have a print of the file
          sorted alphabetically by posting term. Entries with multiple
          posting terms are listed once for each posting term. The Invert
          Print commands above sort and print the following files,
          respectively:

          'NLD.SSDTIC.MASTER'
          'NLD.NASA.MASTER'

          Sample pages of NASANVRT and DTICNVRT are shown in Figures 14 and
          15.

DTICPRNT  These commands generate prints of the master files sorted
NASAPRNT  alphabetically by keys. The files are, respectively:

          'NLD.SSDTIC.MASTER'
          'NLD.NASA.MASTER'

          Sample pages of NASAPRNT and DTICPRNT are shown in Figures 16 and
          17.

DTICTOT   The Total command provides a count of the number of entries in
NASATOT   the appropriate data file, broken down by logic code. Error
          messages are written for entries that do not have a valid logic
          code.
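
The count produced by the Total command can be pictured as a tally keyed by logic code, with a separate list of entries whose code is not valid. The Python sketch below is illustrative only. It assumes that a logic code is valid when it is non-blank and built from the characters C, E, I, L, T, and 0 accepted by the update edit checks; the function name total_by_logic_code and the tuple layout are hypothetical.

```python
from collections import Counter

# Assumption: valid logic code characters, taken from the update edit checks.
VALID_LOGIC_CHARS = set("CEILT0")


def total_by_logic_code(entries):
    """Tally (logic_code, key) pairs by logic code.

    Returns (counts, invalid_keys): a Counter of valid logic codes and the
    keys of entries whose logic code is blank or contains other characters.
    """
    counts = Counter()
    invalid_keys = []
    for logic_code, key in entries:
        code = logic_code.strip()
        if code and set(code) <= VALID_LOGIC_CHARS:
            counts[code] += 1
        else:
            invalid_keys.append(key)
    return counts, invalid_keys


counts, invalid_keys = total_by_logic_code([
    ("E", "EUROPE;00"),
    ("T", "FORESTS;RAIN"),
    ("E", "MEDICINE;00"),
    ("X", "BAD;00"),      # hypothetical entry with an invalid logic code
])
```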

DTICUNLD  The Unload command copies the entries in a VSAM data file into a
NASAUNLD  series of smaller sequential files that can be edited online.
          These sequential files contain 3,000 entries each and have extra
          space allocated for additions. The job creates as many
          sequential files as are needed to hold all of the VSAM file
          entries. The files are named in this pattern:

          LEX.SEQ1.DTIC        or  LEX.SEQ1.NASA
          LEX.SEQ2.DTIC        or  LEX.SEQ2.NASA
          LEX.SEQ3.DTIC        or  LEX.SEQ3.NASA
          LEX.SEQ4.DTIC, etc.  or  LEX.SEQ4.NASA, etc.

          The entries in these sequential files are in the following
          format:

          Columns   1 -   3    Logic Code
          Columns   4 - 127    Key
          Columns 130 - 400    Posting term

          When editing is completed, the corrected files are loaded into
          the appropriate data file. Programmer assistance is required for
          this reload, so maintenance personnel are cautioned not to
          attempt this reload themselves.
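
The fixed-column layout of the unloaded sequential files can be read and written with simple string slicing. The Python sketch below is a hedged illustration of that layout only; the function names are hypothetical, and it assumes that columns 128-129 are blank filler, which the column table implies but does not state.

```python
def parse_unloaded_entry(line):
    """Split one fixed-column entry into (logic_code, key, posting_term).

    Column numbers in the report are 1-based and inclusive, so columns 1-3
    become the slice [0:3], columns 4-127 become [3:127], and columns
    130-400 become [129:400].
    """
    logic_code = line[0:3].strip()
    key = line[3:127].strip()
    posting_term = line[129:400].strip()
    return logic_code, key, posting_term


def format_unloaded_entry(logic_code, key, posting_term):
    """Build a 400-column entry; columns 128-129 are assumed to be blank."""
    return f"{logic_code:<3}{key:<124}  {posting_term:<271}"


line = format_unloaded_entry("E", "MEDICINE;00", "MEDICINE@")
entry = parse_unloaded_entry(line)
```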
[Sample listing page dated 06/28/83, headed DTIC LEXICAL DICTIONARY BY
POSTING TERM, with columns for logic code, key, and posting term, sorted
alphabetically by posting term.]

Figure 14 DTICNVRT Sample Output
[Sample listing page headed NASA LEXICAL DICTIONARY BY POSTING TERM, with
columns for logic code, key, and posting term, sorted alphabetically by
posting term.]

Figure 15 NASANVRT Sample Output
[Sample listing page headed DTIC LEXICAL DICTIONARY, with columns for
logic code, key, and posting term, sorted alphabetically by key.]

Figure 16 DTICPRNT Sample Output
[Sample listing page of NASA Phrase Matching file entries, with columns
for key and posting term, sorted alphabetically by key.]

Figure 17 NASAPRNT Sample Output
DTICUPDT  These commands utilize work datasets for compiling changes,
NASAUPDT  including additions or deletions, which are intended for the
          master files. These work datasets, LEX.DTIC.MOD and LEX.NASA.MOD
          respectively, can be edited online.

          As a transaction is entered at the terminal, a series of edit
          checks take place. Transactions passing the edit checks (good
          transactions) are loaded into a temporary work dataset.
          Rejected transactions generate error messages online.

          The error messages that are returned interactively by the system
          follow:

          INVALID CHARACTERS
          The transaction contains characters other than the following
          valid set: A-Z, 0-9, +, ?, >, &, ', $, (, ), %, *, /,
          [illegible], -, or blank.

          INVALID CHARACTER IN LOGIC
          The transaction contains characters other than the following
          valid set in the logic code position (before the first dollar
          sign): DEL, C, E, I, L, T, 0 (zero), or blank.

          LOGIC CODE TOO LONG
          More than three characters or blanks appear before the first $ in
          the transaction.

          LOGIC CODE ALL BLANKS
          Three blanks appear before the first $ in the transaction.

          NO POSTING TERM
          Nothing appears following the second $ in the transaction.

          TOO MANY $'s
          More than two $'s appear in the transaction.

          INVALID FORMAT
          The transaction does not conform to one of the formats:
          Logic code$Element;Element$Posting term
          Logic code$Element;00$Posting term
          DEL$(Key of record to be deleted)

          COMMAND NOT FOUND
          If this error message appears following any command, check to
          make sure that you are logged on under the maintenance ID and
          that you have spelled the command correctly. If the problem
          persists or if other error messages appear, check with the
          application programmer assigned to the Lexical Dictionary.

          An error that generates one of these messages must be corrected
          in the manner indicated before the system will accept the entry.
          When the session is ended by entering /*, the system performs a
          second series of edit checks on the transactions held in the
          temporary file, and loads the transactions into the LEX.____.MOD
          file appropriate to the command. These files are, respectively,
          LEX.DTIC.MOD and LEX.NASA.MOD. Rejected entries are listed on
          the printout under the heading TRANSACTIONS IN ERROR and must be
          researched, reformatted if necessary, and entered correctly using
          the online edit capability of the data base management system.
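
The interactive edit checks listed above can be approximated by a single function that returns the first message that applies. This Python sketch is an illustration under stated assumptions, not the Facility's code: the order in which the real system applies its checks is not documented here, the name check_transaction is hypothetical, and because the printed list of valid characters is partly illegible, the sketch assumes that the semicolon, @, comma, and period are valid since the record formats in this report use them.

```python
import string

# Assumption: ';', '@', ',' and '.' are added to the legible part of the
# printed character list because keys and posting terms in this report use them.
VALID_CHARS = set(string.ascii_uppercase + string.digits + "+?>&'$()%*/-;@,. ")
VALID_LOGIC_CHARS = set("CEILT0 ")


def check_transaction(transaction):
    """Return the first applicable error message, or None if the entry passes."""
    if not set(transaction) <= VALID_CHARS:
        return "INVALID CHARACTERS"
    fields = transaction.split("$")
    if len(fields) > 3:
        return "TOO MANY $'s"
    logic = fields[0]
    if logic == "DEL":
        # DEL$key is the only two-field format.
        return None if len(fields) == 2 and fields[1] else "INVALID FORMAT"
    if len(logic) > 3:
        return "LOGIC CODE TOO LONG"
    if not set(logic) <= VALID_LOGIC_CHARS:
        return "INVALID CHARACTER IN LOGIC"
    if logic.strip() == "":
        return "LOGIC CODE ALL BLANKS"
    if len(fields) < 3 or ";" not in fields[1]:
        return "INVALID FORMAT"
    if fields[2].strip() == "":
        return "NO POSTING TERM"
    return None


results = {
    t: check_transaction(t)
    for t in ["E$EUROPE;00$EUROPE", "DEL$ERUOPE;00", "X$EUROPE;00$EUROPE"]
}
```

A transaction for which the function returns None would go on to the temporary work dataset; any other result corresponds to a message shown at the terminal.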

DTICVAL   These commands initiate comparisons between data files and
NASAVAL   authority files. DTICVAL compares the DTIC/NASA Subject
          Switching file entries with the NASA and DTIC thesauri authority
          files. DTICVAL checks:

          • Every DTIC term appearing in the key field against the
            DTIC Thesaurus authority file. If a term in the NLD
            key does not appear in the authority file, an error
            message is generated.

          • Every NASA term appearing in the posting term field
            against the NASA Thesaurus authority file. If an NLD
            posting term does not appear in the authority file, an
            error message is generated.

          • Every posting term in the DTIC Thesaurus authority file
            against the NLD keys. If there is no key in the NLD
            for the Thesaurus posting term, an error message is
            generated.

          These error messages highlight the additions, modifications, and
          deletions required in the DTIC/NASA Subject Switching file.

          NASAVAL initiates a set of comparisons between the entries in the
          NASA Phrase Matching file and the NASA Thesaurus authority files.
          NASAVAL checks:

          • Every term appearing in the posting term field against
            the NASA Thesaurus authority file. If an NLD posting
            term does not appear in the Thesaurus authority file,
            an error message is generated.

          • Every posting term and Use reference appearing in the
            NASA Thesaurus authority file against the NLD file
            keys. Each of these terms should appear as a key in
            the NLD file, and an error message is generated if it
            does not.

          • Every posting term in the NASA Thesaurus authority file
            against the NLD file posting terms. If a Thesaurus
            posting term does not also appear as an NLD posting
            term, an error message is generated.

          These error messages highlight the additions, modifications, and
          deletions required in the Phrase Matching file.
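
The three DTICVAL comparisons are, in effect, membership tests between the file and two sets of authority terms. The Python sketch below shows one way to express them; it is not the validation program itself. Several points are assumptions made for the sake of illustration: the function name validate_subject_switching is hypothetical, the authority files are modeled as Python sets, the element 00 is treated as filler rather than a thesaurus term, and a DTIC term is counted as covered when it appears as an element of any key. The labels on the results are the printout page headings named in this report.

```python
def validate_subject_switching(records, dtic_authority, nasa_authority):
    """Compare Subject Switching records with the two authority term sets.

    `records` is an iterable of (key, posting_terms) pairs, where `key` holds
    elements joined by ';' and `posting_terms` is a list of NASA terms.
    Returns a list of (page_heading, unmatched_term, key) tuples.
    """
    errors = []
    seen_dtic_terms = set()
    for key, posting_terms in records:
        for element in key.split(";"):
            if element == "00":   # assumed filler, not a thesaurus term
                continue
            seen_dtic_terms.add(element)
            if element not in dtic_authority:
                errors.append(("UNMATCHED KEYS", element, key))
        for term in posting_terms:
            if term not in nasa_authority:
                errors.append(("UNMATCHED POSTING TERMS", term, key))
    # Authority terms that never occur in any key of the file.
    for term in sorted(dtic_authority - seen_dtic_terms):
        errors.append(("UNMATCHED DTIC TERMS", term, None))
    return errors


errors = validate_subject_switching(
    records=[
        ("CLARK;DUKE", ["SUPER PROGRAMMER"]),
        ("FORESTS;RAIN", ["RAIN FORESTS"]),
        ("EUROPE;00", ["EUROPE"]),
    ],
    dtic_authority={"FORESTS", "RAIN", "EUROPE", "FOOTWEAR"},
    nasa_authority={"RAIN FORESTS", "EUROPE"},
)
```

The NASAVAL comparisons follow the same pattern, with the NASA Thesaurus authority files on one side and the Phrase Matching file keys and posting terms on the other.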
DTICVSAM  This command is used to create a DTIC Thesaurus authority file
          from LEX.POSTTERM.DTIC, a list of posting terms from DTIC.
          LEX.POSTTERM.DTIC is a sequential file created from DTIC's
          Thesaurus tape and so far updated manually by NLD maintenance
          personnel using online editing capabilities. Each time
          LEX.POSTTERM.DTIC is updated, a new VSAM authority file must be
          created with the DTICVSAM command.

VALSETUP  This command creates two authority files for NASA Thesaurus terms
          from the online Thesaurus files:

          • A sequential file of NASA posting terms and Use
            references. This file is used by the validation
            routine to check that there is an entry in the Phrase
            Matching file for every NASA posting term and Use
            reference.

          • A VSAM file of NASA posting terms only:
            'NLD.THES.TERMS'

          The VSAM file is used by NASAVAL to verify that all posting terms
          appearing in the posting term field of existing entries in the
          Phrase Matching file are valid NASA posting terms, and by DTICVAL
          for the same purpose in the Subject Switching file. NASAUPDT,
          NASALOAD, DTICUPDT, and DTICLOAD use the NASA VSAM authority file
          for validating new transactions being added to the data files.

          As the VSAM file is being created, each term is checked against
          the Phrase Matching file ('NLD.NASA.MASTER') to determine if it
          should be marked as an array term and to add the @ to the term if
          required. To look at a term in the NASA VSAM authority file
          'NLD.THES.TERMS', use the PRINT IDS command.

PRINT IDS This command allows an online look at any VSAM file record and at
          a user-specified number of additional sequential records.
          DTICFIND and NASAFIND are shortcuts for displaying records in
          master files 'NLD.SSDTIC.MASTER' and 'NLD.NASA.MASTER',
          respectively. However, to see a record in other VSAM files, for
          example the NASA file 'NLD.THES.TERMS', one must use PRINT IDS.
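
The lookup behavior shared by DTICFIND, NASAFIND, and PRINT IDS (start at the requested key if it exists, otherwise at the position where it would occur, then show a number of sequential records) corresponds to a binary search on a sorted sequence. The Python sketch below illustrates the idea with a sorted list in place of a VSAM file. The function name find_records is hypothetical, and Python's default string ordering is assumed, which may differ from the collating sequence of the original system.

```python
from bisect import bisect_left


def find_records(sorted_keys, requested_key, count=10):
    """Return up to `count` keys beginning at `requested_key`.

    If the requested key is absent, the listing begins at the sequential
    position where the key would occur.
    """
    start = bisect_left(sorted_keys, requested_key)
    return sorted_keys[start:start + count]


keys = sorted([
    "FOLIAGE;00", "FOOD;00", "FOOTWEAR;00", "FORESTRY;00", "FORESTS;00",
])
from_existing_key = find_records(keys, "FOOD;00", count=2)
from_missing_key = find_records(keys, "FOR")
```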
Printout Review



Following execution of a command, any printout that has been generated
is examined by NLD personnel to see:

• Whether or not the job has run satisfactorily.

• Whether or not there are any errors that must be corrected.

The Facility NLD Maintenance Manual lists step-by-step instructions
for recognizing and correcting errors from each listing.

In general, errors listed in printouts generated by any of the
commands are listed under a heading that implies what the problem is and
how to fix it.

A sample of error messages listed on printouts generated by NLD
maintenance commands follows:

KEY (unmatched element) OF (key of rejected transaction) IS NOT FOUND

The transaction has been rejected because the specified element of the
key does not match any entry in the input posting term authority file.
The non-match may be the result of:

1. misspelling in the transaction,

2. failure to separate multiple elements of the key with
semicolons, or

3. an error in the input posting term authority file.

If the error is of type 1 or 2, correct the error in the appropriate
LEX.____.MOD file before re-executing the load. If the error is of
type 3, use the online edit capability of the data base management
system to correct the error in the corresponding LEX.POSTTERM.____
file, and execute the ____VSAM command to recreate a corrected input
posting term authority file. Then the Load command may be
re-executed.

POSTING TERM (unmatched posting term) OF (entire posting term field of
rejected transaction) IS NOT FOUND

The transaction has been rejected because the specified element of the
transaction's posting term does not match any entry in the NASA
posting term authority file. The non-match may be the result of:

1. misspelling in the transaction,

2. leaving out a required symbol,

3. including an incorrect symbol,

4. failure to separate multiple posting terms with commas, or

5. an error in the NASA posting term authority file.

If the error is of types 1 through 4, correct the error in the
appropriate LEX.____.MOD file before re-executing the load. If the
error is in the authority file (type 5), notify the lexicographer.
The error must be corrected in the RECON online Thesaurus and a new
NASA posting term authority file created by executing VALSETUP before
the Load command can be re-executed.

INVALID LOGIC CODE IN RECORD (rejected transaction) or NO LOGIC CODE
IN RECORD (rejected transaction)

The transaction has been rejected because the logic code was missing
or was incorrect. Correct the logic code in the appropriate
LEX.____.MOD file and re-execute the Load command.

ELEMENTS IN KEY NOT IN ALPHA ORDER (key of rejected transaction)

In a Subject Switching file, the elements of the key must be in
alphabetical order. Correct the key in the appropriate LEX.____.MOD
file before re-executing the Load command.

Errors in a LEX.____.MOD file are corrected using the online edit
capabilities of the data base management system. When the file is
corrected, it is loaded into the appropriate data file by entering
DTICLOAD or NASALOAD.

In a DTICVAL printout, there may be a page headed UNMATCHED KEYS.
This contains an error message for every Lexical Dictionary entry that
contains in the key a term that is not matched on the DTIC Thesaurus
authority file. The job prints out both the erroneous term and the
entire record containing the erroneous term, in the following format:

KEY NOT FOUND = CLARK
OF RECORD = T CLARK;DUKE SUPER PROGRAMMER

If the Lexical Dictionary key is erroneous, delete the record and add
a corrected entry to the file if necessary. In some cases, these
errors are due to errors in the DTIC Thesaurus authority file. If an
authority file error is located, fix the error in the
LEX.POSTTERM.DTIC file using the online edit capabilities of the data
base management system and then execute DTICVSAM to create a new DTIC
Thesaurus authority file.

The page headed UNMATCHED POSTING TERMS contains an error message for
each Lexical Dictionary entry containing a posting term which is not
matched in the NASA Thesaurus authority file. The job prints out the
errors in the following format:

Logic
Code   Key                  Posting Term          Error Message

For example:

T      Clark;Duke           Super Programmer      Key not found-Super Programmer
                                                  Next record is Supercavitating
                                                  Flow
In this example, the program could not match the posting term in the
authority file at all. (This is a test record added to the file by
said programmer.) As an aid to the analyst, the program prints out
the record appearing in the authority file following the place where
the unmatched posting term should have appeared. When the error is a
typographical error in the Lexical Dictionary key, the "next record"
is frequently the correct spelling of the NASA posting term.

C      Power Equipment;00   Electric Equipment@   Key not found-Electric Equipment@
                                                  Found Electric Equipment
                                                  without @

In this example, the program could not exactly match the Lexical
Dictionary posting term Electric Equipment@ in the NASA Thesaurus
authority file, but did find Electric Equipment (the same term without
the "@"). Check the term in question in the NASA Thesaurus.

If the term is not an array term, correct the entry in the Lexical
Dictionary by deleting the @ from the posting term in the entry using
the ____UPDT procedure. See the Changes to an Existing Record subsection
of "Maintenance Functions."

If the term is an array term, the error (a missing "@") is in the NASA
Thesaurus authority file. The VALSETUP program which creates the NASA
Thesaurus authority file adds the "@" to terms if an "@" appears
following that term in the NASA Phrase Matching file. Therefore, if
this type of error is located in the NASA Thesaurus authority file, it
means that there is an error in the NASA Phrase Matching file for that
term. The Phrase Matching entry is located in the 'NLD.NASA.MASTER'
file using NASAFIND.

Using the NASAUPDT procedure, the @ is added to the posting term, and
VALSETUP is executed to create a correct NASA Thesaurus authority
file.

T      Flight;Stresses      Flight Stress         Key not found-Flight Stress
                                                  Found Flight Stress
                                                  with @

In this example, the program could not exactly match the NLD posting
term Flight Stress in the NASA Thesaurus authority file, but did find
Flight Stress@. Check the term in question in the NASA Thesaurus.

If the term is not an array term, the error is in the NASA Thesaurus
authority file. See the explanation in the preceding example for how
to correct this type of authority file error.

If the term is an array term, the error is in the DTIC/NASA Subject
Switching file and can be corrected by adding the "@" to the posting
term in the NLD entry.
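
The three printout examples above follow one decision sequence: look for an exact match, then for the same term with or without the array term symbol, and otherwise report the record that follows the place where the term would have appeared. The Python sketch below reproduces that sequence for illustration. It is not the validation program; the function name diagnose_posting_term is hypothetical, a sorted Python list stands in for the NASA Thesaurus authority file, and the wording of the returned messages only approximates the printout.

```python
from bisect import bisect_left


def diagnose_posting_term(term, sorted_authority):
    """Explain why `term` fails to match a sorted list of authority terms.

    Returns None on an exact match; otherwise a short message modelled on
    the printout examples. '@' is the array term symbol.
    """
    position = bisect_left(sorted_authority, term)
    if position < len(sorted_authority) and sorted_authority[position] == term:
        return None
    if term.endswith("@") and term[:-1] in sorted_authority:
        return f"Found {term[:-1]} without @"
    if not term.endswith("@") and term + "@" in sorted_authority:
        return f"Found {term} with @"
    if position < len(sorted_authority):
        return f"Next record is {sorted_authority[position]}"
    return "No following record"


authority = sorted([
    "ELECTRIC EQUIPMENT", "FLIGHT STRESS@", "SUPERCAVITATING FLOW",
])
messages = [
    diagnose_posting_term(t, authority)
    for t in ["SUPER PROGRAMMER", "ELECTRIC EQUIPMENT@", "FLIGHT STRESS"]
]
```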



- A page headed UNMATCHED DTIC TERMS lists an error message for every 
term which appears in the DTIC Thesaurus authority file for which no 
entry appears in the DTIC/NASA Subject Switching file. The errors 
appear in the following, format: ^ ' 

TERM NOT FOUND = (unmatched term) 

Check the unmatched term in the most recent DTIC Thesaurus supplement. 

If the term, is a valid DTIC termJ determine the correct NASA 
translation, create a new Subject Switching file record for the term, 
and add the record to the Lexical Dictionary file using DTICUPDT. 

If the unmatched term is not a valid DTIC term, the error is in the 
authority file. Correct or delete the incorrect entry in the manually 
maintained DTIC Thesaurus file LEX.POSTTERM.DTIC. Execute the 
DTICVSAM command to create a corrected DTIC Thesaurus authority file. 
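
The UNMATCHED DTIC TERMS check described above amounts to a set comparison between the authority file and the Subject Switching file. The following is a minimal sketch, not the Facility's validation program: the function name, the sample terms (including the made-up term "Zeta Widgets"), and the rule that a term counts as matched when a single-term key "term;00" exists are all assumptions made for illustration.

```python
def unmatched_dtic_terms(authority_terms, switching_keys):
    """Return report lines for authority-file terms with no switching entry.

    Assumption: a term counts as matched when the Subject Switching file
    holds the single-term key "<term>;00" (compared case-insensitively).
    """
    single_term_keys = {
        key[: -len(";00")].strip().lower()
        for key in switching_keys
        if key.endswith(";00")
    }
    return [
        f"TERM NOT FOUND = {term}"
        for term in sorted(authority_terms)
        if term.strip().lower() not in single_term_keys
    ]


# Hypothetical sample data, for illustration only.
authority_terms = ["Apogee", "Architects", "Area Bombing", "Zeta Widgets"]
switching_keys = [
    "Apogee;00",
    "Architects;00",
    "Area Bombing;00",
    "Antennas;Gravity waves",
]

report = unmatched_dtic_terms(authority_terms, switching_keys)
print("UNMATCHED DTIC TERMS")
for line in report:
    print(line)
```

Each reported term would then be checked against the most recent DTIC Thesaurus supplement, as the procedure above describes.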

Other error messages may report on the status of:

NASA THESAURUS VALIDATION 

THESAURUS TERMS-NO NLD KEY 

THESAURUS TERMS - NO NLD POSTING TERM 

READING FOR ATSIGN AND KEY NOT FOUND (term) 

If there are problems that the NLD maintenance personnel cannot
correct, the application programmer assigned to the NLD must be consulted.



RESULTS AND CONCLUSIONS



Applications



A lexical dictionary can be used for any application that requires 
translation of input phrases to phrases in a target vocabulary. As of 
January 1, 1984, NASA was using the NLD for three applications: 

• building Subject Switching capabilities,

• processing DTIC (TAB) tapes, and

• processing DOE (EDB) tapes.

Each of these applications uses the NLD system in a different way. 
Building Subject Switching capabilities uses the Phrase Matching mode and 
accepts all matches, complete or partial, from the NLD system. DTIC TAB 
tape processing uses both the Phrase Matching and Subject Switching modes, 
and accepts only complete matches from the NLD System. DOE EDB tape 
processing used only the Phrase Matching mode (until the DOE/NASA Subject 
Switching file became operational) and accepted only complete matches from 
the NLD system.

As this is being written, two other applications are in the programming 
stage: 

• Using the Phrase Matching file to process Library of Congress
MARC records and accepting complete or partial translations.

• Using the Phrase Matching file to process natural language
phrases automatically extracted from abstracts or other text and
accepting complete or partial translations.



Benefits

Benefits obtained from the use of the NLD were measured with the least
possible disruption to the indexing process. The hypothesis upon which the
NLD was authorized was that the NLD would increase the indexers'
productivity and reuse the indexing already done by DTIC. It was intended
that the quality of the indexing would remain high. The following analysis
shows that our samples were adequate, that the results are significant, and
that we have proven our hypothesis.

Evaluation Methods. The evaluation of the NLD was based on a
comparison of the preliminary subject analysis study done for December
1982 through March 1983 with a post-implementation study done for December
1983 through March 1984.

Study number 1 included confidential interviews with each indexer; all
were conducted by the same interviewer to ensure consistency. In addition,
a sample of 100 documents was selected from a single DTIC TAB tape, and the
results of Subject Switching for these documents were analyzed. These 
documents were taken from all categories represented on the tape. Since 
there were fewer than 100 categories, multiple selections were made from 
some categories in approximate proportion to the number of documents 
assigned to the more populous categories. 

Study number 2 utilized a questionnaire because there was concern that 
observed time studies would be intrusive and slow production. Indexers, 
without consulting with one another, filled out their questionnaires 
simultaneously. In addition, a representative sample of 250 DTIC
documents, drawn over a three-month period, was analyzed. 

Comparisons. See Figure 18. Although study 1 had a sample of 100
documents, two of the DTIC posting term values were discarded as being too
deviant, leaving a sample size for DTIC of 98. The following table shows
some of the comparisons made.



                                     Study 1 (Pre)       Study 2 (Post)
                                     DTIC      NASA      DTIC      NASA

Documents in sample            N       98       100       250       250

Mean of terms assigned         X    14.32      9.59     13.09     12.60

Standard deviation                   4.88      2.03      4.73      2.05
  s = sqrt(Σx²/N), where
  Σ = the sum and x =
  the deviation from the
  mean

Variance                      s²    23.81      4.11     22.35      4.19

Standard error of mean        [illegible]       .20       .30       .13



The standard error is small. Since the "t" test (used to ascertain
the deviation of the estimated mean from the mean of the population) gives
us a value which is off the t chart but indicates a better than 99%
confidence level, we conclude that our samples and the results of our
comparative study (shown above) are valid for the entire population.

It is interesting to note that before using the NLD, there was 
considerable difference in the average number of index terms assigned by 
the two agencies: 14.32 to 9.59. Study 2 shows that the averages are now 
very close: 13.09 to 12.60. 
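
The standard errors in the table can be checked from the sample sizes and standard deviations, since the standard error of a mean is the standard deviation divided by the square root of N. The sketch below does that check. The two-sample (Welch) t statistic at the end is only one plausible reading of the "t" test mentioned above, not necessarily the test the authors ran, and the Study 1 NASA standard deviation of 2.03 is an inference from the reported variance of 4.11.

```python
import math

# (N, mean, standard deviation) as read from the comparison table.
# The Study 1 NASA standard deviation (2.03) is inferred from the
# reported variance of 4.11; it is not fully legible in the source.
samples = {
    "Study 1 DTIC": (98, 14.32, 4.88),
    "Study 1 NASA": (100, 9.59, 2.03),
    "Study 2 DTIC": (250, 13.09, 4.73),
    "Study 2 NASA": (250, 12.60, 2.05),
}


def standard_error(sd, n):
    """Standard error of the mean: standard deviation over sqrt(N)."""
    return sd / math.sqrt(n)


def welch_t(sample_a, sample_b):
    """Two-sample t statistic for the difference of two means (assumed test)."""
    n_a, mean_a, sd_a = sample_a
    n_b, mean_b, sd_b = sample_b
    pooled_error = math.sqrt(
        standard_error(sd_a, n_a) ** 2 + standard_error(sd_b, n_b) ** 2
    )
    return (mean_b - mean_a) / pooled_error


for name, (n, mean, sd) in samples.items():
    print(f"{name}: standard error = {standard_error(sd, n):.2f}")

t_nasa = welch_t(samples["Study 1 NASA"], samples["Study 2 NASA"])
print(f"NASA pre vs. post t statistic = {t_nasa:.1f}")
```

Under these assumptions the t statistic for the change in NASA's mean is far above the usual 99% critical values, which is consistent with the statement that the value is "off the t chart".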

Access Points and Productivity. The increase in the number of NASA
index terms assigned to a document, as indicated in the above table, not
only signals increased productivity, but also increases the number of
access points to a document. Evaluations of document retrieval have
indicated that these changes have not affected retrieval adversely. The
pertinency level of retrieval has remained high throughout the introduction
and use of the NLD.

Figure 18. A Comparison of DTIC and NASA Indexing. (Plots of number of
documents against number of index terms per document, for the preliminary
study sample and the post-implementation sample of each agency.)

Time Savings. Two concerns in the field of library and information
science are the ever-growing amounts of material to be classified, stored,
and disseminated, and a constant need to do more work for less or the same
amount of money. Information scientists are looking for ways to get
information to the user more quickly. We feel that the NLD is making a
contribution in this area.

[Illegible] of the indexers reported that having index terms provided by the
NLD makes indexing DTIC documents faster. The remaining indexers indicated
that having the suggested NLD terms has no effect on their speed.

Indexers were asked to estimate the time saved by having NLD terms.
The average was 5.4 minutes per document. See Figure 19. Indexers then
were asked to estimate the time required to index a DTIC document with NLD
terms provided. The average of these estimates was 10 minutes. When this
figure was compared with the study 1 (pre-NLD) average indexing time of 13
minutes, a 3 minute difference was noted. The predicted savings per
document was 2 to 3 minutes. Based on the indexers' estimates, the
intended goal has been reached and may have been exceeded. This time
savings obviously speeds up the document turnaround time and can increase
the timeliness of the product.
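
The averages quoted above can be reproduced from the individual estimates listed in Figure 19. The sketch below is a worked check of that arithmetic; treating the two largest estimates (10 and 15 minutes) as the outliers is an assumption based on which values are absent from the figure's second list.

```python
from statistics import mean

# Indexer estimates as listed in Figure 19 (minutes per document).
time_saved = [3, 2, 10, 4, 3, 1, 15]                    # 7 indexers replied
time_to_index = [6, 8, 7.5, 20, 4, 15, 7, 10, 12.5, 10]  # 10 indexers replied

# Assumption: the estimates of 10 and 15 minutes are the outliers dropped
# in the figure's second calculation.
without_outliers = [minutes for minutes in time_saved if minutes < 10]

avg_saved = mean(time_saved)            # 38 / 7
avg_trimmed = mean(without_outliers)    # 13 / 5
pooled = (round(avg_saved, 1) + round(avg_trimmed, 1)) / 2
avg_index = mean(time_to_index)         # 100 / 10

pre_nld_minutes = 13                    # study 1 average indexing time
print(f"average time saved:        {avg_saved:.1f} minutes")
print(f"average without outliers:  {avg_trimmed:.1f} minutes")
print(f"pooled estimate:           {pooled:.1f} minutes")
print(f"average time to index:     {avg_index:.1f} minutes")
print(f"difference from study 1:   {pre_nld_minutes - avg_index:.1f} minutes")
```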

Changes in Work Emphasis. As an indexer tool, the NLD has relieved
the indexers of having to look up many terms in the thesaurus. The correct
form is presented for use or for deletion. In place of this rather
mechanical task, indexers are asked to watch for coordinations of DTIC
terms that should be translated to single NASA terms. This process should
result naturally from the review of the indexing terms presented by the NLD
printout, and the change in emphasis can provide more challenge to the
indexer's job than just looking up correct forms of terms.

Shared Resources. Reindexing work that already has been done at 
taxpayer expense is wasteful of government resources. The original and 
primary purpose of the NLD was to utilize indexing done by other agencies. 
The sharing of indexing with DTIC also has brought about sharing of some 
programming and improved quality in the thesauri and lexical dictionaries 
of both DTIC and NASA. 

Stepping Stone. The Lexical Dictionary has been found to be a
stepping stone to other endeavors. Its Phrase Matching capabilities are
being expanded and will soon be used to add NASA terms to MARC records.
The NLD is also a way of approach to machine-aided indexing of abstracts or
other text.

Estimate of Time Saved
(7 indexers replied)

     3    minutes saved per document
     2
    10
     4
     3
     1
    15
    --
    38 ÷ 7 = 5.4 minutes average

Estimate without the outliers

     3
     2
     4
     3
     1
    --
    13 ÷ 5 = 2.6 minutes

Pooled estimate of time saved

    5.4
    2.6
    ---
    8.0 ÷ 2 = 4 minutes per document

Estimate of time to index
(10 indexers replied)

     6    minutes per document
     8
     7.5
    20
     4
    15
     7
    10
    12.5
    10
   -----
   100 ÷ 10 = 10 minutes per document

Figure 19



Problems

Different Indexing Philosophies. Indexing philosophies differ from
agency to agency. This difference must be addressed in translating
concepts. For example, one agency may index to "Ablative nose cones" and
another to "Ablation" and "Nose cones". The first is precoordinated, i.e.,
the two concepts are joined in the index term. The second is
post-coordinated, i.e., coordinated at the time of retrieval.

Some agencies include in their indexing any broader terms that appear
in the hierarchy of the term used. For example, if the term used is
"Hafnium-alpha" the indexer or the system would assign all of the following
terms: Hafnium, Refractory metals, Metals, and Elements. Another system
would assign only the most specific term that applies. If the document
were on the subject of Hafnium-alpha, they would index to that only, or if
that term were not available, to Hafnium.

These differences must be taken into account in setting up a lexical 
dictionary system. 
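
The two policies just described can be contrasted in a few lines. This is an illustrative sketch only: the `BROADER` table is a hypothetical single-parent fragment built from the Hafnium-alpha example, and the function names are not taken from any NLD program.

```python
# Hypothetical fragment of a thesaurus hierarchy: term -> immediate broader
# term. A single-parent chain is assumed for simplicity.
BROADER = {
    "Hafnium-alpha": "Hafnium",
    "Hafnium": "Refractory metals",
    "Refractory metals": "Metals",
    "Metals": "Elements",
}


def broader_terms(term):
    """All broader terms above `term`, nearest first."""
    chain = []
    while term in BROADER:
        term = BROADER[term]
        chain.append(term)
    return chain


def index_with_hierarchy(term):
    """Policy 1: assign the term used plus every broader term."""
    return [term] + broader_terms(term)


def index_most_specific(term, vocabulary):
    """Policy 2: assign only the most specific term available."""
    candidate = term
    while candidate is not None and candidate not in vocabulary:
        candidate = BROADER.get(candidate)
    return candidate


print(index_with_hierarchy("Hafnium-alpha"))
print(index_most_specific("Hafnium-alpha", {"Hafnium", "Metals"}))
```

A translation table between two such systems has to decide whether the broader terms supplied by the first policy should be carried over or dropped when posting to a vocabulary that follows the second.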

Chemical Compounds, Complexes, Metal Alloys, etc. If you can't match
a chemical compound or complex or an alloy term exactly, you may expect
trouble with translations and retrieval. Coordinations of terms in this
area of knowledge are likely to produce unwanted citations in retrievals.
It is important to know how an indexer is instructed to handle these
concepts.

Semantics and Scope. The term "Performance tests" in DTIC's
environment has no restrictions in meaning. In NASA's environment, this
term applies only to operating equipment. Another agency has the term
Blowouts and defines it as "high pressure...ejection of water, gas, or oil
from a borehole"; in NASA's Thesaurus, Blowouts is related to Tires and
Fatigue life. These are homonyms, two terms that match character for
character, but convey different concepts. Every term must be examined as
to its scope and meaning in both the input and target environments.



Recommendations 

Automation. Automate the initial entries and the continuation entries,
and use online editing. This will keep the manhours needed for data entry
and for error correction to a minimum.

Indexing Policies and Vocabularies. Become familiar with the indexing
policies and vocabularies of both organizations: the one contributing, and
your own, the target.



SUMMARY



Problem and Proposed Solution



Because the NASA STI Facility and the Defense Technical Information
Center have overlapping interests, they share information. Twenty percent
of the NASA data base was previously indexed by DTIC. Most of the
documents received from DTIC are on microfiche accompanied by a magnetic
tape that provides DTIC's cataloging, abstracting, and indexing in
machine-readable form. Management proposed that the Facility automatically
translate DTIC's posting terms to NASA's terms in order:

• To avoid the reindexing that was necessary to adapt the 
information to the NASA system and 

• To save indexing time. 



Construction of the NLD 



The NASA Lexical Dictionary was constructed in four phases. 

Phrase Matching. Phase One centered on constructing a file consisting
of entries for every posting term and Use reference in the NASA Thesaurus,
as well as additional Use references constructed specifically for the NLD
System. Most of the programs were written in this phase also. The Phrase
Matching mode attempts to find word-by-word matches between any input
phrases and NASA terms or Use references. Matches may be complete or
partial. For example:

Input-Any Terms            Output-NASA Terms

Gold                       Gold
Gold plate                 Gold coatings
Gold plated                Gold coatings
Gold plated chassis        Gold coatings, chassis
Gold-plated chassis        Chassis (because gold-plated, with
                           a hyphen, has not been added to the
                           file yet)

Subject Switching Individual Terms. Phase Two consisted of the
construction of a translation table between the DTIC Thesaurus terms and
NASA's. Entries in the file pair each DTIC term with one or more NASA
terms that best express the same concept. For example:

Input-DTIC Terms           Output-NASA Terms

Anti Fogging Agents        Fog Dispersal
Antioxidants               Antioxidants
Apogee                     Apogees
Approach                   Approach+
Architects                 Architecture, Personnel
Area Bombing               NIS (meaning Not In Scope)
Area Coverage              00 (meaning no equivalent concept)

Subject Switching Coordinates. Phase Three completed the Subject
Switching file by adding entries of coordinated DTIC terms translated to
one or more NASA terms that express the same concept. For example:

Input-DTIC Terms           Output-NASA Terms

Angles; resolution         Angular resolution
Antennas; Gravity waves    Gravity wave antennas
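
The two kinds of Subject Switching entry, individual and coordinated, can be pictured as two lookup tables. The sketch below is a simplified illustration using the example entries given above; the table names, the function name, and the choice to try coordinated pairs before single terms are assumptions, and the real file stores both kinds of entry under semicolon-separated keys.

```python
from itertools import combinations

# Entries taken from the examples above; keys are lower-cased for matching.
NO_TRANSLATION = {"NIS", "00"}   # not in scope / no equivalent concept
SINGLE = {
    "anti fogging agents": ["Fog Dispersal"],
    "antioxidants": ["Antioxidants"],
    "apogee": ["Apogees"],
    "architects": ["Architecture", "Personnel"],
    "area bombing": ["NIS"],
    "area coverage": ["00"],
}
COORDINATED = {
    ("angles", "resolution"): ["Angular resolution"],
    ("antennas", "gravity waves"): ["Gravity wave antennas"],
}


def switch_terms(dtic_terms):
    """Translate a document's DTIC posting terms to NASA posting terms."""
    terms = sorted(term.strip().lower() for term in dtic_terms)
    used, output = set(), []
    # Coordinated entries first: two DTIC terms posted to one NASA concept.
    for pair in combinations(terms, 2):
        if pair in COORDINATED and not used.intersection(pair):
            output.extend(COORDINATED[pair])
            used.update(pair)
    # Then single-term entries for terms not consumed by a coordination.
    for term in terms:
        if term in used:
            continue
        for nasa_term in SINGLE.get(term, []):
            if nasa_term not in NO_TRANSLATION:
                output.append(nasa_term)
    return output


print(switch_terms(["Gravity waves", "Antennas", "Apogee", "Area Bombing"]))
```

Terms marked NIS or 00 produce no NASA posting term, which is why "Area Bombing" contributes nothing to the result.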



Feedback and Maintenance. Phase Four was concerned with user feedback
and file maintenance. New terms added to either the DTIC Thesaurus or the
NASA Thesaurus require additions and modifications to entries in the data
files. In addition, users can supply feedback as to translations that
should be added or modified.



Results 



In the NLD, the NASA STI Facility has a system that translates words
and phrases from input material into equivalent concepts expressed in NASA
posting terms. The system was designed particularly to allow the reuse of
DTIC indexing in the NASA environment. According to a study of 250 DTIC
documents, 89 percent of the terms assigned to DTIC documents by NASA
indexers now are suggested by the NLD. The Facility also saves an
estimated 3 minutes of indexing time per document.

While translating DTIC's index terms to those of the NASA Thesaurus,
the NLD has preserved the quality of NASA's indexing. Also we know of no
other system that differentiates concepts that are expressed by homonyms,
or that coordinates terms in one vocabulary and translates them to the same
concept expressed in the different terms of another vocabulary. The NASA
Lexical Dictionary system is not only operating, but it is doing so with
considerable success.



GLOSSARY

Access Routine          A general purpose computer program that accesses the NLD
                        files. The Access Routine never operates independently; it
                        is always called by an application program.

application program     A program that passes to the Access Routine two parameters:

                        • a code that indicates whether Phrase Matching or
                          Subject Switching mode should be employed, and

                        • a character string that is either a word or phrase
                          for Phrase Matching, or a set of posting terms
                          assigned to a citation by a contributing source for
                          Subject Switching.

continuation entry      An entry that tells the computer to continue to look for
                        additional key elements in order to reach the posting term.

continuation symbol     The one or more asterisks (*) or percent signs (%) used in
                        the posting term field of continuation entries.

controlled vocabulary   Terms that are authorized by an organization for their
                        indexers to use in listing the subject matter of, or the
                        concepts contained in, a document; a list of posting terms
                        acceptable to the system and available for use.

DOE                     Department of Energy.

DTIC                    Defense Technical Information Center.

element                 An element is part of the key to a record. In the Phrase
                        Matching file, each word of the input phrase is an individual
                        element. In the Subject Switching file, each contributing
                        source posting term (which may be single or multiple words)
                        is an element.

exception listing       A list of input posting terms which cannot be matched and
                        translated by the NLD.

gloss                   A parenthetical expression used to clarify the meaning of a
                        posting term or Use reference that otherwise might be
                        ambiguous. For example:

                        LOX (oxygen)
                        Pitch (inclination)
                        Pitch (material)

key                     The subject of and a unique field in an NLD record. The key
                        consists of terms that may be encountered in the input
                        material. The key can consist of a single element, followed
                        by a semicolon and two zeros (;00) or multiple elements
                        separated by a semicolon.

LDICT                   Lexical Dictionary.

LEX.POSTTERM.DTIC       A sequential file of DTIC posting terms.

logic code              A one character code, which indicates how the key is to be
                        processed.

major terms             Posting terms that, in the judgment of the indexer, express
                        the major concepts and research areas of a document. Major
                        terms may be used for online searching and retrieval or to
                        generate printed indexes. See also Minor terms.

minor terms             Posting terms that, in the judgment of the indexer, indicate
                        minor concepts and areas of interest in the information
                        presented in a document. Minor terms encompass such aspects
                        as properties, characteristics, action determined, relevant
                        conditions of the investigation, measurement techniques, and
                        instruments or calculations used when these aspects are not
                        of primary importance. Minor terms do not appear in
                        published subject indexes but may be used for online
                        searching and retrieval. See also Major terms.

NASA                    National Aeronautics and Space Administration.

NLD                     NASA Lexical Dictionary.

posting terms           Controlled vocabulary terms that are used by an
                        organization's indexers to index documents for the use of
                        that organization.

Phrase Matching file    A file of NASA terms and Use references which are posted to
                        valid NASA Thesaurus terms. The file is used as a general
                        purpose translation table. This translation table accepts as
                        input all of the posting terms and Use references from the
                        NASA Thesaurus as well as additional Use references
                        constructed especially for the NLD.

Phrase Matching mode    A general purpose matching routine which attempts to find
                        word-by-word matches between any input phrases and NASA
                        posting terms or Use references.

                        The Phrase Matching mode can be used to process any type of
                        phrase input. This input can consist of document titles,
                        freely assigned keywords, or posting terms from a
                        contributing source Thesaurus for which Subject Switching
                        entries have not yet been created.

PMF                     Phrase Matching file.

SSF                     Subject Switching file.

STI                     Scientific and Technical Information, as in the NASA STI
                        Facility.

Subject Switching file  A file of a contributing organization's authorized vocabulary
                        which provides, in the posting term field, one or more NASA
                        posting terms that express the same concept. An entry is
                        created for every contributed posting term, but in some cases
                        the translation may indicate that the term is out of scope or
                        not able to be translated. Additional entries are created
                        for combinations of input posting terms posted to one or more
                        different NASA terms.

Subject Switching mode  A special purpose routine that translates the posting terms
                        assigned to a document by a particular contributing source
                        (such as DTIC) into NASA posting terms. Subject Switching
                        translates the concepts represented by the posting terms, in
                        contrast to Phrase Matching which looks only for word
                        matches. A unique translation table, called a Subject
                        Switching file, is built for the posting terms of each
                        contributing source.

Use reference           A reference from a posting term that is not in the controlled
                        vocabulary to one that is. For example:

                        Condensation trails use Contrails

Use for reference       A posting term that is "used for" a term that is not in the
                        controlled vocabulary. For example:

                        Contrails use for Condensation trails

VSAM                    Virtual Storage Access Method. VSAM records, stored on
                        direct access devices, may have fields of fixed or variable
                        length and may be processed directly or sequentially.



APPENDIX A

PROCESSING OF INPUT WORDS AND PHRASES BY
THE NLD ACCESS ROUTINE

The Access Routine determines the logic code associated with each element in
the input array from the appropriate data file. If an element is not located
in the file, a logic code of "?" is assigned for the use of Access Routine
processing. The logic code controls the way the element will be processed by
the Access Routine. A "T" logic code indicates that the Access Routine must
look for combinations between that input element and other elements from the
input array. Using "T" logic, a search key is created by adding to the "T"
element a ";" followed by the next element in the input array. The Access
Routine then tries to match this search key with a key in the data file.

• If the search key matches a file key which translates to a posting 
term, that posting term is returned. 

• If the search key matches a file key which contains a continuation
character (*, **, etc.) in the posting term field, then the next
element from the input array is added to the end of the search key. A
match is again attempted with the file.

• If no match is found for the search key, then the final element of the
search key is replaced by the next element of the input array and a
match is again attempted with the file.

• If all of the elements from the input array are tried without finding a
match and the lead element has not been used in any other successful
match, then a ";00" is added as the final element of the key for a
final search attempt.

The "T" logic processing is repeated with each "T" logic element from the 
input array as the first element of the search key. 

When the logic code of an element is anything other than a "T", the following 
logic is followed: 

• If the element has already been used in combination with a "T" logic 
element in a successfully matched search key, then that element is 
skipped. 

• If the element has not already been used in a successfully matched
search key, the ";00" is added to the end of the element to create a
search key that is matched against the file.

The Access Routine returns all of the matches made on the input character 
string to the application program. 
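
The processing rules above can be sketched in a short program. This is an illustrative reading of the description, not the Facility's Access Routine: the function names are hypothetical, the file is held as an in-memory table, an element's logic code is taken from the first file entry keyed on that element, and processing for a lead element stops at its first complete match. The entries are the Phrase Matching file entries from this appendix's worked example.

```python
# Phrase Matching file entries from the worked example in this appendix:
# key -> (logic code, posting term). Keys are stored lower-cased.
ENTRIES = {
    "endurance;00": ("E", "Endurance"),
    "engine;control": ("T", "Engine Control"),
    "engine;design": ("T", "Engine Design"),
    "engine;testing": ("T", "*"),
    "engine;testing;laboratories": ("T", "Engine Testing Laboratories"),
    "engine;tests": ("T", "Engine Tests"),
    "laboratories;00": ("E", "Laboratories"),
    "research;and": ("T", "*"),
    "research;and;development": ("T", "Research and Development"),
    "research;facilities": ("T", "Research Facilities"),
    "research;00": ("T", "Research"),
    "testing;machines": ("T", "Testing Machines"),
    "testing;time": ("T", "Testing Time"),
    "testing;00": ("T", "Tests"),
}


def is_continuation(posting):
    """True when the posting term field holds only continuation symbols."""
    return bool(posting) and set(posting) <= {"*", "%"}


def logic_code(element, entries):
    """Logic code of the first entry keyed on this element, or "?" if none."""
    prefix = element.lower() + ";"
    for key, (code, _) in entries.items():
        if key.startswith(prefix):
            return code
    return "?"


def access_routine(elements, entries=ENTRIES):
    """Return the posting terms matched for an input array of elements."""
    used = set()     # positions already consumed by a successful match
    matches = []

    def lookup(parts):
        entry = entries.get(";".join(part.lower() for part in parts))
        return entry[1] if entry else None

    for i, element in enumerate(elements):
        if logic_code(element, entries) == "T":
            key, positions, matched = [element], [i], False
            for j in range(i + 1, len(elements)):
                posting = lookup(key + [elements[j]])
                if posting is None:
                    continue                 # try the next element in its place
                key.append(elements[j])      # keep the element in the search key
                positions.append(j)
                if not is_continuation(posting):
                    matches.append(posting)  # complete match
                    used.update(positions)
                    matched = True
                    break
            if matched or i in used:
                continue
        elif i in used:
            continue
        posting = lookup([element, "00"])    # single-element search key
        if posting is not None and not is_continuation(posting):
            matches.append(posting)
            used.add(i)
    return matches


phrase = "Engine endurance testing research laboratories"
print(access_routine(phrase.split()))
```

For the example phrase the sketch returns the same three posting terms as the hand trace in this appendix. It tries "Engine;Testing;Research" before "Engine;Testing;Laboratories", a step the simplified trace omits.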

The following example illustrates a simple case of Access Routine processing
for the Phrase Matching example given in the section on SYSTEM DESCRIPTION
under Lexical Dictionary Access Routine. Subject Switching processing is
basically the same, but the Subject Switching input array would consist of
contributing source posting terms sorted alphabetically.

Input Phrase: Engine endurance testing research laboratories

Input Array       Logic Code

Engine            T
Endurance         E
Testing           T
Research          T
Laboratories      E

Phrase Matching File Entries for the Words in the Input Array:

Logic Code   Key                            Posting Term

E            Endurance;00                   Endurance
T            Engine;Control                 Engine Control
T            Engine;Design                  Engine Design
T            Engine;Testing                 *
T            Engine;Testing;Laboratories    Engine Testing Laboratories
T            Engine;Tests                   Engine Tests
E            Laboratories;00                Laboratories
T            Research;And                   *
T            Research;And;Development       Research and Development
T            Research;Facilities            Research Facilities
T            Research;00                    Research
T            Testing;Machines               Testing Machines
T            Testing;Time                   Testing Time
T            Testing;00                     Tests

Access Routine Processing: 

(References are made to the Input Array and Lexical Dictionary Entries shown 
above.) 

Processing Description                              Outcome

Logic code of first element is T.
Create search key from first two elements.
Look search key "Engine;Endurance" up in file.      Key not found.

Replace final element of search key with next
element in array. Look search key
"Engine;Testing" up in file.                        Key found. Continuation
                                                    symbol returned.

Add next element in the array to the
search key. Look search key
"Engine;Testing;Laboratories" up in file.           Key found. Posting term
                                                    "Engine testing
                                                    laboratories" returned.

Move on to next element in array.
Logic code of second element is E.
Second element has not been used
in any T combination.
Create search key from second element.
Look search key "Endurance;00" up in file.          Key found. Posting term
                                                    "Endurance" returned.

Move on to next element in array.
Logic code of third element is T.
Create search key from third
and fourth elements.
Look search key "Testing;Research" up in file.      Key not found.

Replace final element of search key
with next element in array.
Look search key "Testing;Laboratories"
up in file.                                         Key not found.

No more elements remain in array to be tried
as a final element of the search key.
"Testing" has already been used in a previous
successful match.
End processing for "Testing".

Move on to next element in array.
Logic code of fourth element is T.
Create search key from the fourth and
fifth elements.
Look search key "Research;Laboratories"
up in file.                                         Key not found.

No more elements remain in array to
be tried as final element of search key.
"Research" has not yet been used.
Create search key by adding ";00" to
"Research".
Look search key "Research;00" up in file.           Search key found.
                                                    Posting term "Research"
                                                    returned.

Move on to next element in array.
Logic code of fifth element is E.
"Laboratories" has already been used
in a successful match.
End processing for "Laboratories".
No more elements in array.
End processing.

The final outcome of the processing is that the input phrase "Engine
endurance testing research laboratories" is translated into the NASA posting
terms "Engine testing laboratories", "Endurance", and "Research".



APPENDIX B

PROCEDURES FOR DETERMINING NASA TRANSLATION FOR DTIC
TERMS FOR THE SUBJECT SWITCHING DATA FILE

1.0 INTRODUCTION 

The DTIC/NASA Subject Switching capability requires that a file be
created containing translations from DTIC posting terms to NASA posting
terms. Because the creation of this file is a significant effort,
automated methods have been employed to generate aids for the analysts.
Lists of potential translations are obtained in the following way: DTIC
terms are run through the NLD Phrase Matching file. The output is sorted
alphabetically by DTIC terms in one printout and by NASA terms in
another. NASA terms are run through the DTIC Lexical Dictionary (LDICT).
This output is also sorted alphabetically by DTIC and by NASA terms.

Frequent references to these printouts required new nomenclature for ease
of communication. Since the printouts are divided into four separate
volumes, they are referred to as Books and numbered in alphabetical
order, as follows:

DTIC/NASA sorted alphabetically by DTIC is Book 1. 

DTIC/NASA sorted alphabetically by NASA is Book 2. 

NASA/DTIC sorted alphabetically by DTIC is Book 3. 

NASA/DTIC sorted alphabetically by NASA is Book 4. 

In order to facilitate the use of the information contained in these four
Books, additional programs re-sort portions of them and generate files of
properly formatted candidate entries for the Subject Switching file.
From the DTIC/NASA translations, separate files are created for:

• Exact Matches - DTIC posting terms for which there are exactly
matching NASA posting terms, and

• Partial Matches - DTIC posting terms for which there are one or
more partial phrase matches with NASA posting terms or Use
references,

and a printout is created of all:

• No Matches - DTIC terms for which there were no matches with NASA
posting terms or Use references.

From the NASA/DTIC translations a file is created for:

• Tables - Two or more DTIC posting terms which, used in
combination, translate to a NASA posting term or terms.

The record for an entry contains a tentative logic code, a key consisting
of one or more DTIC terms, and a posting term field which consists of one
or more NASA posting terms. In the case of the exact matches and partial
matches, the analysts edit the tentative NASA translations which were
generated by the NLD. In the suggested table entries, the analysts edit
the keys, which are tentative DTIC translations generated by
the DTIC LDICT. The no match printout is a listing, rather than a file,
of DTIC terms for which the NLD had no matching entry, either complete or
partial.

Analysts study these potential entries and handle them according to the
following set of guidelines covering both general procedures and
procedures which are unique for each type of entry.

2.0 GENERAL PROCEDURES

2.1 RECORD FORMAT 

The record format for the computer generated candidate entries for the 
DTIC/NASA Subject Switching file is as follows: 

Logic code

A one character code which is entered in the second column of the
record. The logic code provides information for Access Routine
processing and describes the relationship between the key and the
posting terms.

Logic codes may be one of the following:

E   The single term key and the posting term are equal.
C   There is a change between the single term key and the single term
    posting term.
L   The single term key translates to a list of posting terms.
I   The translation of the single term key is indeterminate and a
    choice is offered to the indexer.
0   There is no NASA translation for the single term key.
T   There are multiple terms in the key.

Key

The key begins in column 4 and consists of one or more valid DTIC
posting terms. If there is only one DTIC posting term in the key, it
is followed by ";00". If there are multiple posting terms in the key
they are separated by semicolons. The key is followed by a "$" which
separates the key from the posting term. Parentheses are removed from
any term in the key.

Posting term

The posting term begins following the key and $. The posting term
consists of one or more NASA posting terms in exactly the same format
in which they appear in the NASA Thesaurus. If there are multiple NASA
posting terms, they are separated by commas.

Note: Entries which are manually coded for data entry using the online
maintenance software follow the format given above with two exceptions.

1. A dollar sign is placed between the logic code and the key,
as well as between the key and the posting term. No spaces
are left between the logic code and the key. For example:

E$Radar;00$Radar

2. No space is left before the logic code, because the online
maintenance software will automatically place the logic code
in the correct column.
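
A manually coded entry in this format can be split into its three fields mechanically. The sketch below is one way to do so; the `Entry` type and the function names are hypothetical, and the online maintenance software itself is not shown in this report.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    """One candidate Subject Switching record (hypothetical structure)."""
    logic_code: str
    key_terms: list       # one or more DTIC posting terms
    posting_terms: list   # one or more NASA posting terms


def parse_entry(line):
    """Split 'code$key$posting terms' into its three fields."""
    logic_code, key, posting = (part.strip() for part in line.split("$", 2))
    # A single-term key carries a trailing "00" element, which is dropped.
    key_terms = [term.strip() for term in key.split(";") if term.strip() != "00"]
    posting_terms = [term.strip() for term in posting.split(",") if term.strip()]
    return Entry(logic_code, key_terms, posting_terms)


def format_entry(entry):
    """Rebuild the manually coded form, adding ';00' to single-term keys."""
    key = ";".join(entry.key_terms)
    if len(entry.key_terms) == 1:
        key += ";00"
    return f"{entry.logic_code}${key}${', '.join(entry.posting_terms)}"


entry = parse_entry("E$Radar;00$Radar")
print(entry)
print(format_entry(entry))
```

Round-tripping an entry through `parse_entry` and `format_entry` is a simple way to confirm that a hand-coded record follows the layout before it is keyed in.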



2.2 USE OF SYMBOLS 

Four symbols may be used with the NASA Thesaurus terms in the posting
term field, to indicate the following conditions.

+   NASA has narrower terms to the suggested term, which are not
    covered in the DTIC Thesaurus. The indexer should look at the
    narrower terms to see if they are appropriate for use with the
    document in hand.

@   The suggested NASA term is an Array term.

>   The suggested NASA term is a broader concept than the DTIC term.

?   The indexer must select the NASA term or terms that are
    appropriate from the suggested terms.

Special rules for symbols:

1. When the "?" is used, all suggested NASA terms should be followed
by a "?". No more than three terms should be suggested in an
indexer choice entry.

2. The ">" should be used when the suggested NASA term is broader
than the DTIC term. However, the ">" is not used when switching
from DTIC terms incorporating broad concepts such as "methods",
"systems", "equipment", etc. to a NASA term for the general
subject. For example:

DTIC                         NASA

Fire Control Equipment       Fire Control
Adaptive Control Systems     Adaptive Control

The ">" is also not used when a coordinated list of NASA terms is
suggested as a translation, even if the coordinated terms are broader
than the DTIC terms.
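
The symbol conventions and special rule 1 above can be illustrated with a short Python sketch. It is an illustration only: the function names are hypothetical, and the Array symbol is assumed here to be "@".

```python
# Symbols that may follow a NASA term in the posting term field.
# Assumption: "@" is the Array-term symbol.
SYMBOLS = {"+": "narrower terms exist", "@": "array term",
           ">": "broader concept", "?": "indexer choice"}


def split_symbols(term: str) -> tuple[str, set[str]]:
    """Strip trailing symbols from a posting term; return (term, symbols)."""
    found = set()
    term = term.strip()
    while term and term[-1] in SYMBOLS:
        found.add(term[-1])
        term = term[:-1].rstrip()
    return term, found


def check_choice_rule(posting_terms: list[str]) -> list[str]:
    """Report violations of special rule 1 for indexer choice entries."""
    problems = []
    flags = ["?" in split_symbols(t)[1] for t in posting_terms]
    if any(flags):
        if not all(flags):
            problems.append('every suggested term must be followed by "?"')
        if len(posting_terms) > 3:
            problems.append("no more than three terms may be suggested")
    return problems


if __name__ == "__main__":
    print(split_symbols("Bolts+"))
    print(check_choice_rule(["Estimates?", "Estimating"]))
```

Special rule 2 is not checked here, since deciding whether a NASA term is broader than the DTIC term requires the judgment of the analyst.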
2.3 OTHER GENERAL GUIDELINES



1. If the NASA Thesaurus does not contain an exact match, then the
name of a discipline, the product of the discipline, or the
instrument used in the discipline may be used interchangeably as
a translation. No ">" or "?" is added. For example:

Holography, Holograms

2. If the NASA Thesaurus does not contain an exact match, then the 
noun form and the gerund form of a term may be used 
interchangeably. No ">" or "?" is added. For example: 

Couplers, Coupling 

3. If the source vocabulary has only one form of a term and NASA
has more than one, then all of the NASA variants should be
presented followed by "?", as an indexer choice. For example:

I Estimates;00$Estimates?, Estimating?

4. Consistency should be maintained between similar switches. Check
other entries that have been coded and try to follow the same
pattern when a similar switch is encountered. For example:

Brigade Level Organizations, Platoon Level Organization, etc.
NIS

5. If the NASA Thesaurus does not contain an exact match, but does
have the opposite, then the opposite is used as the translation.
No ">" or "?" is added. For example:

Antijamming, Jamming

6. Avoid the use of Array terms when possible, but use Array terms
rather than translating a term to "00". If three or fewer "?"
terms can be substituted for an Array term, do so. For example:

NOT E Ballast;00$Ballast@

BUT I Ballast;00$Ballast(Mass)?,Ballasts(Impedances)?

7. Geographical terms should be translated according to the
following rules. The rules are listed in order of priority:

Rivers 

a. Use the specific "River" term if it is available. 

b. If a "basin" term is available for the river, list both the
"basin" term and "Rivers" followed by "?" as an indexer
choice.

c. If the river belongs in only one country or state, list
"Rivers" and the country or state term.

d. If the river is in the United States and belongs in more than
one state, list "Rivers" and "United States."

e. If the river belongs in more than one country, list "Rivers"
and the continent name.

Islands 

a. Use the specific island term if it is available. 

b. If the island is part of a larger island group for which
there is a term, use the broader group term.

c. Use the term "Islands" and the body of water in which the islands
are located, unless a narrower combined term, such as
"Pacific Islands" is available.

d. Use the term "Islands" and the name of a country only if the
island is both owned by and adjacent to the country.

Cities, Towns, etc.

a. Use the specific city term if it is available.

b. Use the term "Cities" and the state or country in which the
city is located.

Seas 



a. Use the specific sea term if it is available. 

b. Use the term "Seas" and the country or continent in which the 
sea is located. 

c. If there is no single appropriate country or continent, use
the term "Seas" with a ">".

8. Limit a list of posting terms to three terms - two, if possible.

9. Post any term to 00 (zero zero) in preference to a poor
translation or one of questionable accuracy.
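
The priority ordering in guideline 7 can be read as a decision cascade. The sketch below encodes the river rules (a through e) in Python as an illustration. The function name, its parameters, and the sample terms are hypothetical, and the fallback to "00" when no rule applies is an assumption drawn from guideline 9.

```python
def translate_river(river_term, nasa_terms, basin_term=None,
                    countries=(), us_states=(), continent=None):
    """Return NASA posting terms for a DTIC river term, rules a-e in order."""
    # a. Use the specific river term if the NASA Thesaurus has it.
    if river_term in nasa_terms:
        return [river_term]
    # b. Offer the basin term and "Rivers" as an indexer choice.
    if basin_term is not None and basin_term in nasa_terms:
        return [basin_term + "?", "Rivers?"]
    if len(countries) == 1:
        # c. One state: "Rivers" plus the state term.
        if countries[0] == "United States" and len(us_states) == 1:
            return ["Rivers", us_states[0]]
        # c./d. One country (including a multi-state U.S. river).
        return ["Rivers", countries[0]]
    # e. More than one country: "Rivers" plus the continent name.
    if len(countries) > 1 and continent is not None:
        return ["Rivers", continent]
    # Assumed fallback (guideline 9): prefer 00 to a poor translation.
    return ["00"]


if __name__ == "__main__":
    sample_terms = {"Example River", "Example River Basin"}  # hypothetical
    print(translate_river("Other River", sample_terms,
                          countries=["United States"],
                          us_states=["Ohio", "Kentucky"]))
```

The rules for islands, cities, and seas could be written the same way, each as its own ordered list of tests.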

3.0 PROCEDURES FOR NO MATCHES 

No computer generated entries are available for No Matches. These 
entries are manually coded and entered, using the online maintenance 
software. Because of this different entry method, an additional "$" is 
placed between the logic code and the key for these entries. 

1. Look for the DTIC term in the NASA Thesaurus, Volume 2: Access 
Vocabulary. This will locate any NASA term that is a format 
variation of the DTIC term, such as singular or 
plural, term inversion, or hyphenated form. It also will locate 
any NASA terms of which the DTIC term is part. If a variant form 
of the DTIC term is found, assign a logic code of C and enter the 
NASA term in the posting field. For example: 

C$Abandonment;00$Escape (Abandonment) 

C$Acetones;00$Acetone

C$Acoustooptics;00$Acousto-optics 
2. If no helpful information is found in the Access Vocabulary, then
look up the DTIC term in the DTIC Thesaurus to determine if any
broader terms are listed for that term. If there is a broader
DTIC term, look up the broader term in Volume 1 of the NASA
Thesaurus to see if that term exists. Examine the hierarchy of
that term, evaluating all narrower and related terms, for a
possible translation of the original DTIC term. If the best
translation of the DTIC term is a broader NASA term, assign a
logic code of C and enter the NASA Broader Term followed by a
greater than sign (>) in the posting term field. For example:

C$Acetonitrile;00$Nitriles> 

3. If approaches 1 and 2 produce no translation for the DTIC term,
check dictionaries for synonyms or related terms to the DTIC
term, and look up these terms in Volume 1 of the NASA Thesaurus.
If it is determined that two or more NASA terms are required to
express the meaning of the DTIC term, assign a logic code of L
and enter the NASA terms in the posting term field, separated by
commas. For example:

L$Adamantanes;00$Agents@, Curing

4. If the DTIC term has two or more equally valid NASA term
translations, assign a logic code of I and enter the possible
NASA terms in the posting term field, following each NASA term
with a question mark and separating terms with a comma. For
example:

I$Automata;00$Automatic control?, Automata theory?

5. If no appropriate NASA translation can be found for the DTIC
term, assign a logic code of 0 (zero) and enter NIS in the
posting term field if the concept is Not In Scope for NASA, or
enter 00 (zero zero) in the posting term field if the concept is
in scope for NASA but no term is available to express the
concept. For example:

0$Attorneys;00$NIS
0$Autumn;00$00
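
The five steps above amount to trying progressively looser translations and recording the outcome in the manually coded format. The following Python sketch shows how the outcome of the analyst's research might be turned into an entry. The function name and its keyword parameters are hypothetical, and the strict ordering of the branches is a simplification of the judgment the procedure calls for.

```python
def code_no_match(dtic_term, *, variant=None, broader=None,
                  coordinated=None, choices=None, in_scope=True):
    """Build a manually coded No Match entry from research results."""
    key = f"{dtic_term};00"  # single-term key followed by ;00
    if variant:                       # step 1: format variation found
        code, posting = "C", variant
    elif broader:                     # step 2: broader NASA term, marked ">"
        code, posting = "C", broader + ">"
    elif coordinated:                 # step 3: list of NASA terms
        code, posting = "L", ", ".join(coordinated)
    elif choices:                     # step 4: indexer choice, each with "?"
        code, posting = "I", ", ".join(c + "?" for c in choices)
    else:                             # step 5: no translation
        code, posting = "0", ("00" if in_scope else "NIS")
    # Manual entries carry "$" between logic code and key as well.
    return f"{code}${key}${posting}"


if __name__ == "__main__":
    print(code_no_match("Acetonitrile", broader="Nitriles"))
    print(code_no_match("Attorneys", in_scope=False))
```

With the examples from this section, the sketch reproduces the entries shown for Acetones, Acetonitrile, Automata, Attorneys, and Autumn.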

4.0 PROCEDURES FOR PARTIAL MATCHES 

Procedures for Partial Matches are the same as for No Matches, except 
that computer generated candidate entries suggest one or more NASA terms 
as a possible translation. These suggested NASA translations serve as a 
starting point for research in the NASA Thesaurus. Edit the posting term 
field of the computer generated entry by changing, adding, or deleting 
NASA posting terms based on research in the Thesaurus. Change the
suggested logic code if it is required by changes made to the posting 
term field. 
5.0 PROCEDURES FOR EXACT MATCHES

1. Look up the Exact Match term in both the DTIC Thesaurus and the
NASA Thesaurus, Volume 1. Check to see whether the NASA and DTIC
terms appear to express the same concept by checking broader,
narrower, and related terms as well as the scope notes. If the
terms express the same concept, go on to procedure 2. If the
terms do not express the same concept, determine whether or not
there is another NASA term that does translate the DTIC term. If
there is no better translation, go on to procedure 2. If a
better translation is available, assign a logic code of C and
post to the different NASA term.

2. Check both Thesauri to see if the NASA term has narrower terms
that are not in the DTIC Vocabulary. If so, then place a plus
sign (+) after the NASA term. This indicates that the indexer
should consider the NASA narrower terms.

E Bolts;00$Bolts+

3. Look up the DTIC term in Book 3, NASA/DTIC sorted by DTIC, in
order to see if multiple NASA terms are translated into the same
DTIC term. If there are multiple, valid translations, then the
following type of entry will be created:

I DTIC term$NASA term 1?,NASA term 2?

to indicate that the given DTIC term could have been used to
express either of the concepts represented by the NASA terms
listed with a question mark. The indexer must select the
appropriate NASA term.

4. If the NASA term sometimes expresses the same concept as the DTIC
term, and sometimes it does not, and if no better translation has
been found, then assign a logic code of I and enter the NASA
term in the posting term field, followed by a question mark to
indicate that the indexer must decide whether or not this term is
appropriate for the document being indexed.

I Performance tests;00$Performance tests?

(The NASA term applies only to operating equipment.)

5. If the DTIC and NASA terms express the same meaning, and if there
is only one entry for the DTIC term in Book 3, then the logic
code remains E and the posting term will be the NASA term
listed.

E Europe;00$Europe
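
As an illustration of how procedures 1 through 5 interact, the sketch below maps the analyst's findings for an Exact Match to a logic code and posting term. It is a simplified reading, not the Facility's software. The function and parameter names are hypothetical, the case where the terms never agree and no better term exists is not covered, and applying "+" only to E entries is an assumption.

```python
def code_exact_match(nasa_term, *, better_term=None, sometimes_same=False,
                     extra_narrower=False, book3_terms=()):
    """Return (logic_code, posting_field) for an Exact Match candidate."""
    # Procedure 1: a different NASA term translates the DTIC term better.
    if better_term:
        return ("C", better_term)
    # Procedure 3: Book 3 shows several valid NASA terms for the DTIC term.
    valid = list(book3_terms)
    if len(valid) > 1:
        return ("I", ", ".join(t + "?" for t in valid))
    # Procedure 4: concepts agree only some of the time.
    if sometimes_same:
        return ("I", nasa_term + "?")
    # Procedures 2 and 5: same concept; flag extra NASA narrower terms.
    suffix = "+" if extra_narrower else ""
    return ("E", nasa_term + suffix)


if __name__ == "__main__":
    print(code_exact_match("Bolts", extra_narrower=True))
    print(code_exact_match("Performance tests", sometimes_same=True))
    print(code_exact_match("Europe"))
```

Each input flag stands for a lookup in the DTIC Thesaurus, the NASA Thesaurus, or Book 3 that the analyst performs by hand.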

6.0 PROCEDURES FOR DTIC TERM COORDINATIONS (TABLES)

1. DTIC terms in the key MUST be in alphabetical order. This is a
requirement of the Access Routine. The term order within keys in
the computer generated entries will be correct.

2. If the translation presented appears reasonable, it should be
accepted exactly as it is. For example:

T Abrasion;Resistance$Abrasion resistance

3. If other combinations of DTIC terms come to mind which provide
equally good or better translations of the same posting term,
code it or them for entry. Remember that the DTIC terms in the
key MUST be in alphabetical order.

T Abrasion;Wear resistance$Abrasion resistance

4. If multiple generic levels of the same concept appear in a
suggested table, two entries should be made into the NLD. One
will include the multiple levels of the concept, as suggested.
The other will include only the most specific term.

T Acids;Ascorbic acid;Metabolism$Ascorbic acid metabolism
add: T Ascorbic acid;Metabolism$Ascorbic acid metabolism

Note: Do not create tables with multilevel terms; only use
multiple terms in the same hierarchy in a table if they appear as
suggested entries.

5. If the broader generic term is the final term or if any table is
imbedded in another table, the key in the shorter entry must have
;00 as the final segment.

T Acoustic waves;Excitation;Waves$Acoustic excitation
T Acoustic waves;Excitation;00$Acoustic excitation

6. If more than two DTIC terms are indicated in a suggested table,
examine the terms for the most pertinent expression of the
concept in the fewest possible terms.

T Aircraft rockets;Fins;Folding fins;Rockets;Vehicles$Folding
fin aircraft rocket vehicle

Fins and rockets as individual terms in the key are unnecessary
and should be eliminated.

7. Check all DTIC terms in the suggested table for possible narrower
terms that would be appropriate substitutions. If an appropriate
substitution is found, code an additional table using that term.
Remember: DTIC terms in the key MUST be in alphabetical order.

T Atmospheres;Spacecraft cabins$Spacecraft cabin atmospheres
add: T Controlled atmospheres;Spacecraft cabins$Spacecraft cabin
atmospheres
8. If a translation appears to be inappropriate, look for a
substitute term or terms to convey the concept of the NASA term.

T Aircraft cabins;Simulators;Spacecraft$Spacecraft cabin
simulators

Change to: T Simulators;Spacecraft cabins$Spacecraft cabin
simulators

9. A † in the computer generated entry indicates that the entry came
from a NASA Use reference. Look up † references in the NASA
Thesaurus, Volume 1, to find the pertinent "Use For" (UF)
reference. If the DTIC terms suggested are a reasonable
translation of the UF reference, use the suggested table. If the
terms are not a reasonable translation for that UF reference,
delete the entry.

T Aircraft;Stars;Warning systems$EC-121 aircraft†

The UF reference applicable is "Warning Star aircraft". Since
this suggested table does not translate the concept of "Star
aircraft" by "stars" and "aircraft", this entry should be
deleted.

10. Delete suggested tables posted to specific NASA terms for space
vehicles, programs, projects, etc. which may be considered
identifiers by DTIC, or which have no valid equivalent in DTIC's
vocabulary. For example, delete:

T Antisubmarine ammunition;Engines;Underwater rockets$
ASROC engine

T Artificial satellites;Biomedicine$BESS (satellite)

11. When a suggested table indicates that DTIC expresses a NASA term
for a chemical compound as two terms, consisting of a chemical
element and a complex, add the word "compounds" to the element if
DTIC has the element-compound term.

Computer generated entries: T Acetates;Lead$Lead acetates

T Aluminum;Hydrides$Aluminum
hydrides

Code as: T Acetates;Lead compounds$Lead acetates

T Aluminum compounds;Hydrides$Aluminum hydrides

12. A single key may be found going to different posting terms in
nonadjacent entries on the computer printout. Watch for these
and examine them carefully. If possible, choose one posting term
and delete the other. If both posting terms seem to be
reasonable translations, list both NASA terms in one table,
following each term with a question mark and separating them with
a comma; then delete the other table.

The automatically generated entries will appear as follows:

T Acoustics;Stability$Acoustic instability
T Acoustics;Nozzles$Acoustic nozzles
T Acoustics;Stability$Frequency stability
T Acoustics;Nozzles$Sonic nozzles

The analyst's revision results in the following two entries to
replace the four automatically generated entries:

T Acoustics;Stability$Acoustic instability?, Frequency
stability?
T Acoustics;Nozzles$Acoustic nozzles?, Sonic nozzles?

13. When a printed table suggests another table which you wish to
add, be sure that the DTIC and NASA concepts are the same. If
necessary, add a second NASA term to achieve this.

Computer generated entry:

T Additives;Propellants$Propellant additives

add entries:

T Additives;Rocket propellants$Propellant additives,
Rocket propellants
T Additives;Solid propellants$Propellant additives,
Solid propellants

14. Don't construct a table going to a list of NASA terms if the
separate elements of the key already are or should be posted
individually to the same terms you intend to list.

Examples of tables which should not be constructed:

T Machinery;Performance tests$Machinery, Performance
tests

because DTIC's term "Machinery" goes to NASA's term "Machinery"
and DTIC's term "Performance tests" goes to NASA's term
"Performance tests" (followed by a question mark). Another table
which should not be constructed is:

T Aircraft;Composite structures;Construction$Aircraft
structures, Composite structures

because DTIC terms "Aircraft" and "Construction" translate to the
NASA term "Aircraft structures" while "Composite structures"
translates to "Composite structures".
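
Two of the mechanical requirements in this section, the alphabetical ordering of DTIC terms in a key (procedure 1) and the combining of tables that share a key (procedure 12), can be sketched in Python as follows. The function names are hypothetical, the "T key$posting" layout is assumed from the examples above, and the case-insensitive sort is an assumption about the collation used by the Access Routine.

```python
def make_key(dtic_terms):
    """Join DTIC terms with semicolons, in alphabetical order."""
    # Assumption: ordering is case-insensitive.
    return ";".join(sorted(dtic_terms, key=str.lower))


def merge_tables(entries):
    """Combine (key, posting_term) pairs that share a key.

    Keys with several posting terms become one indexer choice table,
    with each NASA term followed by "?".
    """
    grouped = {}
    for key, posting in entries:
        terms = grouped.setdefault(key, [])
        if posting not in terms:
            terms.append(posting)
    merged = []
    for key, postings in grouped.items():
        if len(postings) == 1:
            merged.append(f"T {key}${postings[0]}")
        else:
            choice = ", ".join(p + "?" for p in postings)
            merged.append(f"T {key}${choice}")
    return merged


if __name__ == "__main__":
    generated = [
        ("Acoustics;Stability", "Acoustic instability"),
        ("Acoustics;Nozzles", "Acoustic nozzles"),
        ("Acoustics;Stability", "Frequency stability"),
        ("Acoustics;Nozzles", "Sonic nozzles"),
    ]
    for line in merge_tables(generated):
        print(line)
```

Such a merge only groups candidates. The analyst still decides whether one posting term should be chosen and the other deleted, and whether a combined table stays within the three-term limit.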






REFERENCES 



1. Klingbiel, Paul H., "Phrase Structure Rewrite Systems in Information
Retrieval." Information Processing and Management (to be published).

2. Gevarter, William B., An Overview of Artificial Intelligence and
Robotics. Volume 1: Artificial Intelligence. Part B: Applications.
National Aeronautics and Space Administration, October 1983.




1. Report No.

NASA CR-3838

2. Government Accession No.

3. Recipient's Catalog No.

4. Title and Subtitle

An Operational System for Subject Switching Between
Controlled Vocabularies: A Computational Linguistics
Approach

5. Report Date

October 1984

6. Performing Organization Code

7. Author(s)

June P. Silvester, Roxanne Newton, and Paul H. Klingbiel

8. Performing Organization Report No.

9. Performing Organization Name and Address

Planning Research Corporation
Government Information Systems
1500 Planning Research Drive
McLean, VA 22102

10. Work Unit No.

11. Contract or Grant No.

NASW-3330

13. Type of Report and Period Covered

Contractor Report

Hoy, ,?, ,iqfiU nar. .11, 1981 


12. Sponsoring Agency Name and Address

National Aeronautics and Space Administration
Washington, D.C. 20546

14. Sponsoring Agency Code

NIT-2

15. Supplementary Notes

Reference: Technical Directive 83-130

16. Abstract

The NASA Lexical Dictionary (NLD), a system that automatically translates input subject
terms to those of NASA, was developed in four phases. Phase One provided Phrase
Matching, a context sensitive word-matching process that matches input phrase words with
any NASA Thesaurus posting (i.e. index) term or Use reference. Other Use references have
been added to enable the matching of synonyms, variant spellings, and some words with the
same root. Phase Two provided the capability of translating any individual DTIC term to
one or more NASA terms having the same meaning. Phase Three provided NASA terms having
equivalent concepts for two or more DTIC terms, i.e. coordinations of DTIC terms. Phase
Four was concerned with indexer feedback and maintenance. Although the original NLD
construction involved much manual data entry, ways were found to automate nearly all but
the intellectual decision-making processes. In addition to finding improved ways to
construct a lexical dictionary, new applications for the NLD have been found and are
being developed.



17. Key Words (Suggested by Author(s))

Translating; Machine translation;
Information systems; Information theory;
Linguistics; Words (language); Computer
programming; Information retrieval;
Semantics; Computer techniques; Terminology

18. Distribution Statement

Unclassified - Unlimited

Subject Category 82

19. Security Classif. (of this report)

Unclassified

20. Security Classif. (of this page)

Unclassified

21. No. of Pages

96

22. Price

A05

For sale by the National Technical Information Service, Springfield, Virginia 22161
NASA-Langley, 1984



